Before we start: GitHub sometimes automatically adjusts the size of the items on the website. If you use an 11'' screen we recommend a zoom of 67%; otherwise it should look all right. To go to the GitHub repo, click on the About tab in the menu bar.

Geographics

We will now take a look at the following three topics: electric vehicle distribution (absolute and relative), income, and political orientation.

Total Amount of Cars

Let's first get an overview of how electric vehicles (EVs) are distributed across the state of California. With 2987 electric vehicles, 94539 is the ZCTA with the most EVs; it is marked with a red dot on the map. At the other extreme, multiple ZCTAs have no registered electric vehicles at all.

Because the number of cars differs greatly between ZCTAs, the heat scale is plotted over the interval [0, 25% of max(EV)].
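
The capped scale can be sketched in a few lines. This is only an illustration, not the code behind the map: the function name `capped_scale` and the toy counts are assumptions, and we assume that every value above the cap is simply drawn in the top colour. Only the 25% cap and the maximum of 2987 come from the text.

```python
def capped_scale(counts, cap_fraction=0.25):
    """Map raw counts to [0, 1], saturating at cap_fraction * max(counts)."""
    upper = cap_fraction * max(counts)
    if upper == 0:
        return [0.0 for _ in counts]
    # Values above the cap are clipped, so they all get the top colour.
    return [min(c, upper) / upper for c in counts]


# Hypothetical EV counts for five ZCTAs; 2987 is the maximum quoted above.
ev_counts = [0, 120, 400, 900, 2987]
scaled = capped_scale(ev_counts)
print([round(s, 3) for s in scaled])
```

With such a cap, the many ZCTAs with a few hundred EVs remain distinguishable instead of being squeezed into the bottom of the scale by a single very large value.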

To the left is a map showing the total number of EVs per ZCTA. A brief look at the map shows that the number of EVs is greater around larger cities such as Los Angeles and San Francisco. Zooming in on San Francisco, there is also a larger concentration of cars in the southern part of the San Francisco Bay, namely the Santa Clara area. This area is also commonly known as the richest county in California.

But inspecting only the total number of cars can be a bit misleading. As the plot showed, there are more EVs around the larger cities, but this is also where most people live. So let's instead look at how large a fraction EVs make up of the total car population.

Fraction of Electric Vehicles

When we inspect the fraction, we still see a pattern of higher densities around larger cities, but the suburbs now also seem to play a more significant role. Inspecting the data a bit further, we also see that the "leader" has changed: 94539 has the highest absolute number of electric vehicles but an EV share of only 7.4%, whereas 94027 (indicated by the red dot on the map) has an EV share of 10.4%. As with the map of absolute numbers, the conclusion is that most electric vehicles are located in urban areas.
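
Why the "leader" can change when moving from counts to shares is easy to see in a small sketch. The ZCTA labels and registration numbers below are made up for illustration, and `ev_share` is a hypothetical helper, not a function from the project.

```python
def ev_share(ev_count, total_vehicles):
    """Return EVs as a fraction of all registered vehicles (0.0 if none)."""
    if total_vehicles == 0:
        return 0.0
    return ev_count / total_vehicles


# Hypothetical registrations: label -> (EV count, all vehicles).
registrations = {
    "A": (2000, 30000),  # many EVs, but a large fleet
    "B": (500, 5000),    # fewer EVs, but a small fleet
    "C": (0, 800),       # no EVs at all
}

shares = {z: ev_share(ev, total) for z, (ev, total) in registrations.items()}
top_absolute = max(registrations, key=lambda z: registrations[z][0])
top_relative = max(shares, key=shares.get)
print(top_absolute, top_relative)
```

Area "A" leads on the absolute count while area "B" leads on the share, which mirrors the switch from 94539 to 94027 described above.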

Map of electoral data

Another interesting aspect of car ownership is how the cars are distributed depending on the political orientation of a geographical area. The Democrats are known to be more supportive of green energy and electric vehicles than the Republicans. But let's take a look at the data.

Just as with the distribution of cars, there is again a clear distinction between cities and the countryside. The cities are mostly Democratic, whereas the countryside has a Republican majority.

We can compare the ZCTAs that have a Democratic majority with the ZCTAs that have a Republican majority. The average fraction of EVs in Republican ZCTAs is 1.46%, and the average fraction of EVs in Democratic ZCTAs is 0.48%.

Comparing these two groups with a t-test, we get a p-value of 4.01 × 10^{-40}.
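
For readers who want to see what such a comparison computes, here is a minimal sketch. The text does not say which variant of the t-test was used; the sketch assumes Welch's unequal-variance version and only computes the test statistic and the degrees of freedom (turning these into a p-value is best left to a statistics library). The function name `welch_t` and the two small samples are made up for illustration.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return Welch's t statistic and its approximate degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard errors of the two sample means.
    se_a = variance(sample_a) / n_a
    se_b = variance(sample_b) / n_b
    t_stat = (mean(sample_a) - mean(sample_b)) / sqrt(se_a + se_b)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    dof = (se_a + se_b) ** 2 / (se_a ** 2 / (n_a - 1) + se_b ** 2 / (n_b - 1))
    return t_stat, dof


# Hypothetical EV shares (in percent) for two small groups of ZCTAs.
group_one = [1.2, 1.6, 1.5, 1.4, 1.7]
group_two = [0.4, 0.5, 0.6, 0.3, 0.6]
t_stat, dof = welch_t(group_one, group_two)
print(round(t_stat, 2), round(dof, 1))
```

The larger the absolute value of the statistic for a given number of degrees of freedom, the smaller the p-value. With thousands of ZCTAs even a modest difference in means can therefore produce an extremely small p-value.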

This clearly indicates that the difference between the two groups is statistically significant. One should, however, be careful about concluding that being Republican in itself means a smaller chance of owning an electric vehicle. Looking back at the map, we saw that people living in larger cities generally had more electric vehicles. This could go hand in hand with the fact that people living in cities generally drive shorter distances, and also remembering that people

Map of Income Data

As mentioned before, certain areas of California are known to be extremely wealthy, especially the southern part of the San Francisco Bay, which contains areas such as Silicon Valley. We can again illustrate income as spatial data to see which areas have a higher income. The median income is chosen because it is a more adequate measure of income than the mean, which would be heavily affected by outliers.
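
A tiny example shows how strongly a single outlier moves the mean while leaving the median almost untouched. The incomes below are invented for illustration and are not taken from the dataset.

```python
from statistics import mean, median

# Hypothetical household incomes (USD) in one area, with one extreme earner.
incomes = [42_000, 51_000, 58_000, 63_000, 70_000, 5_000_000]

mean_income = mean(incomes)
median_income = median(incomes)
print(round(mean_income), median_income)
```

Here the mean ends up far above what five of the six households earn, whereas the median stays in the middle of the typical incomes.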

Not surprisingly, the richest areas are again south of San Francisco and north of Los Angeles. The richest area is ZCTA 94027 (indicated by the red dot on the map). This ZCTA lies in a suburb of San Francisco, within the same area where we earlier observed a large fraction of electric vehicles; in fact, it is also the ZCTA with the largest relative proportion of EVs. A plot illustrating the relationship between income and EVs can be seen further down the page.

In the last plot of this section, readers can interact with the map themselves and check the income and number of EVs for specific ZCTAs.

Compare the data directly

Below, one can use the maps to easily compare different ZCTA codes and check the relationships between income, political orientation, and electric vehicles.

Household income and EVs

The histogram above shows two series: the number of ZCTAs with a given income (first y-axis) and the number of EVs per income level, aggregated across ZCTAs (second y-axis). We clearly observe that the majority of electric vehicles are located in the richer ZCTAs: the EV distribution lies further to the right than the income distribution. We also observe that the income distribution looks like a skew normal distribution, as indicated by the mode lying to the left of the median. The EV distribution seems to follow the same behaviour.
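
The two series in the histogram can be built with a simple aggregation, sketched below. The bin width of 25,000, the helper `income_bin` and the sample pairs are assumptions made for illustration; only the open-ended top bin at 250,000 is taken from the figure.

```python
from collections import Counter


def income_bin(income, width=25_000, top=250_000):
    """Return the lower edge of the income bin; incomes >= top share one bin."""
    return min(income, top) // width * width


# Hypothetical (median income in USD, EV count) pairs, one per ZCTA.
zctas = [
    (38_000, 5),
    (62_000, 20),
    (64_000, 35),
    (110_000, 300),
    (260_000, 900),
    (400_000, 700),
]

zcta_per_bin = Counter()  # first y-axis: number of ZCTAs per income bin
ev_per_bin = Counter()    # second y-axis: EVs summed over the ZCTAs in a bin
for income, ev in zctas:
    b = income_bin(income)
    zcta_per_bin[b] += 1
    ev_per_bin[b] += ev

for b in sorted(zcta_per_bin):
    print(b, zcta_per_bin[b], ev_per_bin[b])
```

Note that every income at or above the top edge falls into the same last bin, so that bin can look inflated compared to its neighbours.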

It is also noticeable that there is some noise in the rightmost bin. As the data range from 0 to 250,000+, with the last bin collecting everything above 250,000, there seems to be an increase here, but in reality the curve might just flatten out further.

The axes of vehicle ownership

In the following we work with a dataset called the National Household Travel Survey [2]. This dataset contains a great number of variables, including income, occupation, living situation and travelling habits. With it we can describe the habits and characteristics of electric vehicle owners compared to vehicle owners in general. It is hard to keep track of all those variables, though, and humans can only visualise three dimensions, so to make the data more accessible we resort to a technique called Principal Component Analysis (PCA). This allows us to find the axes along which the dataset has the most variance, which makes them the most central factors of the dataset. To read more about this, see for example [3]. Let's inspect the significant variables in the first three principal components to get a feeling for what they represent:
We can describe component 1 as an occupation vector, while component 2 seems to be a combination of transportation habits, especially public transport. Component 3 contains variables on electric or hybrid car ownership, paired with information on education and age. We can continue evaluating the other components in the same way. In this case there are 10 principal components that together account for around 40% of the variance in the dataset. This might not seem like a lot, but keep in mind that this dataset has a total of 89 principal components. We plot the electric vehicle owners on these axes to see whether their distribution looks different from that of regular vehicle owners (which we suspect). For this we choose a jitter plot: each point is placed at its value on the principal component axis, with some noise added to make the points visually distinguishable.
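
To make the idea of "axes of most variance" and "share of variance explained" concrete, here is a toy PCA on just two made-up variables, using the closed-form eigen-decomposition of their 2x2 covariance matrix. This is a sketch for intuition only: the function `pca_2d` and the data are hypothetical, and an analysis with 89 variables would normally rely on a numerical library rather than on this formula.

```python
from math import hypot, sqrt
from statistics import mean


def pca_2d(xs, ys):
    """PCA for two variables. Returns ((lam1, lam2), first_axis)."""
    n = len(xs)
    mx, my = mean(xs), mean(ys)
    # Entries of the sample covariance matrix [[a, b], [b, c]].
    a = sum((x - mx) ** 2 for x in xs) / (n - 1)
    c = sum((y - my) ** 2 for y in ys) / (n - 1)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)
    # Eigenvalues of a symmetric 2x2 matrix in closed form.
    root = sqrt((a - c) ** 2 + 4 * b ** 2)
    lam1, lam2 = (a + c + root) / 2, (a + c - root) / 2
    # Eigenvector for lam1: the direction of largest variance.
    if b != 0:
        vx, vy = b, lam1 - a
    else:
        vx, vy = (1.0, 0.0) if a >= c else (0.0, 1.0)
    norm = hypot(vx, vy)
    return (lam1, lam2), (vx / norm, vy / norm)


# Two strongly correlated, made-up variables.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.1, 1.9, 3.2, 3.8, 5.0]
(lam1, lam2), axis = pca_2d(xs, ys)
explained = lam1 / (lam1 + lam2)  # share of variance on the first component
print(round(explained, 4), axis)
```

Because the two toy variables are almost perfectly correlated, a single component carries nearly all the variance. In a survey with many loosely related variables the variance is spread over many components, which is why 10 components can account for only around 40%.
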
We can start to see some interesting patterns. The first component, for example, seems to have two classes, although electric and non-electric vehicles are equally present in both. For component 2, the electric and non-electric vehicle owners are somewhat separated, although overlapping. The EV owners generally seem to have a higher value of this component, indicating that they live in more urban areas and use public transport more (the two are of course quite correlated). Principal component 3 seems to separate the data entirely, but here we need to be careful: this component contains, among others, the variable "alternative fuel", so it will of course show this separation. The reader is welcome to continue this data exploration using the slideshows presented. Furthermore, there is an attempt at a RandomForest classifier, whose results we did not deem relevant enough; it can be found in the About section.