need for data imputation #3

Open
dataknut opened this issue Jun 28, 2019 · 1 comment

@dataknut
Member

We currently only have observations for when EVs are being driven or are charging. This means there is a lot of 'missing' data when it comes to calculating 'population' average kW charging demand etc. We need to impute the missing 1-minute observations (giving them 0 kW charging) and re-calculate the sample means. Obviously we can't impute state_of_charge - although we could assume some decay curve between the last and next real observation?
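A minimal sketch of that imputation for a single vehicle, in Python with pandas. The column names `charge_kw` and `state_of_charge`, the helper name `impute_one_vehicle` and the toy data are assumptions for illustration, not the project's actual schema. Straight-line interpolation stands in for the "decay curve", which the issue leaves unspecified, and timestamps are assumed to be already aligned to whole minutes.

```python
import pandas as pd


def impute_one_vehicle(obs: pd.DataFrame) -> pd.DataFrame:
    """Fill the gaps in one vehicle's 1-minute series.

    `obs` is assumed to have a DatetimeIndex aligned to whole minutes and
    the hypothetical columns `charge_kw` and `state_of_charge`.
    """
    # Full 1-minute grid between the first and last real observation.
    grid = pd.date_range(obs.index.min(), obs.index.max(), freq="1min")
    out = obs.reindex(grid)

    # Flag imputed rows so they can be excluded or audited later.
    out["imputed"] = out["charge_kw"].isna()

    # No observation is taken to mean no charging: 0 kW.
    out["charge_kw"] = out["charge_kw"].fillna(0.0)

    # Placeholder for a decay curve (assumption): linear in time between
    # the last and next real state_of_charge readings.
    out["state_of_charge"] = out["state_of_charge"].interpolate(method="time")
    return out


# Toy data: three real observations with a three-minute gap.
obs = pd.DataFrame(
    {"charge_kw": [3.0, 3.0, 7.0], "state_of_charge": [50.0, 51.0, 45.0]},
    index=pd.to_datetime(
        ["2019-01-01 00:00", "2019-01-01 00:01", "2019-01-01 00:05"]
    ),
)

filled = impute_one_vehicle(obs)
observed_mean = obs["charge_kw"].mean()    # mean over observed minutes only
imputed_mean = filled["charge_kw"].mean()  # mean over every minute in the span
```

Comparing `observed_mean` with `imputed_mean` shows how far the observed-only mean overstates demand once the zero-charging minutes are counted.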

@dataknut dataknut self-assigned this Jun 28, 2019
dataknut referenced this issue Jun 28, 2019
…t charge levels (power demand) and charging times; corrected interp of 0 charging & renamed 'fast' to 'rapid' throughout (can't remember why)
@dataknut
Member Author

@raffertyparker comments: "To accurately analyse time-averaged data:

Many opportunities for further research involve using mean values over time. What we currently have is mean values over the instances during which data was being collected, but it appears that for most of the time when the vehicles were neither driving nor charging, no data was collected. This makes the time-averaged plots both messy and practically useless. As an example, in the plots of daily charging demand (deleted from main report) no weighting is given to the fact that charging occurs more frequently at certain times than others.

The data is quite sporadic (data from different vehicles have different start dates and times, and data is not sent at consistent time intervals). As far as I understand, in order to get true mean values over time we need to consider the zero values where no charging or driving was occurring. For this, I presume we need to create an "empty" dataframe for each individual vehicle that consists of a datetime column, and then columns of zeros for each variable we want to find the mean of, running the duration of each vehicle's data collection period at predetermined intervals (15 mins should be sufficient). To this we would "add" the original data that falls within each time interval, allowing true mean values to be found from the new dataframe. This would assume no missing non-zero data during the dates between which each vehicle has data collected."
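The approach described in that comment could be sketched as follows, in Python with pandas. The column names (`vehicle_id`, `timestamp`, `charge_kw`), the helper name `zero_filled_intervals` and the toy data are hypothetical. The sketch also assumes, as the comment does, that an interval with no rows really means zero charging rather than lost data.

```python
import pandas as pd


def zero_filled_intervals(df: pd.DataFrame, value_cols, freq: str = "15min") -> pd.DataFrame:
    """Per-vehicle interval means, with intervals lacking data set to 0.

    `df` is assumed to have the hypothetical columns `vehicle_id` and
    `timestamp` (datetime64) plus the numeric columns in `value_cols`.
    """
    frames = []
    for vehicle_id, g in df.groupby("vehicle_id"):
        g = g.set_index("timestamp")

        # "Empty" frame of zeros spanning this vehicle's collection period.
        grid = pd.date_range(
            g.index.min().floor(freq), g.index.max().floor(freq), freq=freq
        )
        empty = pd.DataFrame(0.0, index=grid, columns=value_cols)

        # Mean of the real observations that fall within each interval.
        observed = g[value_cols].groupby(g.index.floor(freq)).mean()

        # "Add" the original data: overwrite zeros where observations exist.
        empty.update(observed)
        empty["vehicle_id"] = vehicle_id
        frames.append(empty)

    return pd.concat(frames).rename_axis("interval_start").reset_index()


# Toy data: two vehicles with different start dates and irregular timing.
raw = pd.DataFrame(
    {
        "vehicle_id": ["A", "A", "A", "B", "B"],
        "timestamp": pd.to_datetime(
            [
                "2019-01-01 18:05",
                "2019-01-01 18:10",
                "2019-01-01 19:02",
                "2019-01-02 18:20",
                "2019-01-02 18:50",
            ]
        ),
        "charge_kw": [6.0, 4.0, 2.0, 3.0, 3.0],
    }
)

intervals = zero_filled_intervals(raw, ["charge_kw"])

# Time-of-day profile averaged over every interval, zeros included.
profile = intervals.groupby(intervals["interval_start"].dt.time)["charge_kw"].mean()
```

Because every vehicle contributes a row for every interval in its collection period, the time-of-day mean in `profile` is weighted by how often charging actually occurs at each time, which is the weighting the comment says the original plots lacked.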
