ArcGIS to CSV data issues #13

Open
theo-armour opened this issue Nov 16, 2013 · 2 comments

@theo-armour
Member

Viewing large CSV files

I use Notepad++ as a text editor. One of its features is that it loads files of up to 100 megabytes or so in a flash. If anybody is having issues, I can add an online CSV data file viewer that is fairly fast.

Data Errors

I note that some records start with a series of '0,0,0,0,0,...'. Currently these are being filtered out, but it would be nice to see what is causing this to happen.

Some records - see Brooklyn - have no street numbers and many polygons. So these are either streets or entire blocks or whatever. Is there any way of filtering these out in the export process?

Three.js is also reporting 'duplicate data' issues on some records. I will dig into this eventually.

Currently, Manhattan is the only borough where I can complete the 'Display Borough' successfully. It would be nice to do better than this - but displaying entire boroughs is not a priority for the time being. Right?

Data Reduction

An essence of working online: keep file size to a minimum.

Here are ways we could reduce CSV file sizes:

CSV files have X,Y,Z data with Z always 0.0. We don't need the Z.
CSV files have a space after the comma. We don't need the space.
CSV files have six places after the decimal point. We probably only need one or two.
CSV files have a lot of data regarding areas etc. We probably don't need this data for basic insolation reports. It might be better placed in separate files.
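
As a rough illustration of the first three reductions, here is a minimal Python sketch. It assumes a record's coordinates arrive as a flat, comma-separated run of X,Y,Z values. The function name `reduce_coords` and its `decimals` parameter are hypothetical, and the extra columns in the real files (addresses, areas) are ignored here.

```python
def reduce_coords(coord_text, decimals=2):
    """Drop every Z value, round X and Y, and join without spaces.

    Assumes coord_text is a flat, comma-separated run of X,Y,Z triples.
    """
    values = [float(v) for v in coord_text.split(",") if v.strip()]
    if len(values) % 3 != 0:
        raise ValueError("expected X,Y,Z triples")
    kept = []
    for i in range(0, len(values), 3):
        x, y = values[i], values[i + 1]  # values[i + 2] is Z, always 0.0
        kept.append(f"{x:.{decimals}f}")
        kept.append(f"{y:.{decimals}f}")
    return ",".join(kept)
```

With `decimals=1`, an input of `"980517.123456, 194321.654321, 0.000000"` becomes `"980517.1,194321.7"`, which is less than half the original length.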

ArcGIS to Lat/Lon

It appears that the NYC data is X and Y in feet from some datum point. Is there any way of converting these positions to lat/lon? If so, then we can easily slip OpenStreetMap or Google Maps data under every rendering, plus weather conditions and more. ;-)
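
One possible route, sketched below in Python, is an inverse Lambert Conformal Conic projection. This assumes the X/Y values are in the New York Long Island State Plane system (NAD83, EPSG:2263, US survey feet), which is commonly used for NYC data but is not confirmed anywhere in this thread. The function name `state_plane_to_latlon` is hypothetical, and a proper projection library would normally be preferable to hand-rolled maths.

```python
import math

# ASSUMPTION: NAD83 / New York Long Island (EPSG:2263), US survey feet.
A = 6378137.0                      # GRS80 semi-major axis, metres
F_INV = 298.257222101              # GRS80 inverse flattening
E2 = 2 / F_INV - 1 / F_INV ** 2    # eccentricity squared
E = math.sqrt(E2)
FT_US = 1200.0 / 3937.0            # metres per US survey foot
LAT1 = math.radians(41 + 2 / 60)   # first standard parallel
LAT2 = math.radians(40 + 40 / 60)  # second standard parallel
LAT0 = math.radians(40 + 10 / 60)  # latitude of origin
LON0 = math.radians(-74.0)         # central meridian
X0_M, Y0_M = 300000.0, 0.0         # false easting and northing, metres


def _m(lat):
    return math.cos(lat) / math.sqrt(1 - E2 * math.sin(lat) ** 2)


def _t(lat):
    s = E * math.sin(lat)
    return math.tan(math.pi / 4 - lat / 2) / ((1 - s) / (1 + s)) ** (E / 2)


# Projection constants derived from the two standard parallels.
N = (math.log(_m(LAT1)) - math.log(_m(LAT2))) / (
    math.log(_t(LAT1)) - math.log(_t(LAT2)))
BIG_F = _m(LAT1) / (N * _t(LAT1) ** N)
RHO0 = A * BIG_F * _t(LAT0) ** N


def state_plane_to_latlon(x_ft, y_ft):
    """Convert assumed EPSG:2263 feet to (lat, lon) in degrees on NAD83."""
    dx = x_ft * FT_US - X0_M
    dy = RHO0 - (y_ft * FT_US - Y0_M)
    rho = math.hypot(dx, dy)
    theta = math.atan2(dx, dy)
    t = (rho / (A * BIG_F)) ** (1 / N)
    lat = math.pi / 2 - 2 * math.atan(t)
    for _ in range(10):  # fixed-point iteration for the ellipsoidal latitude
        s = E * math.sin(lat)
        lat = math.pi / 2 - 2 * math.atan(t * ((1 - s) / (1 + s)) ** (E / 2))
    lon = LON0 + theta / N
    return math.degrees(lat), math.degrees(lon)
```

If the assumption holds, X values around 980,000 ft and Y values around 194,000 ft should land in lower Manhattan. NAD83 lat/lon differs from the WGS84 used by web maps by roughly a metre, which should not matter for a map underlay.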

ArcGIS export

I have not looked into this yet, but before too long I will look into the app Mostapha pointed to. I look forward to adding San Francisco and other cities, and to finding ways of reducing file sizes.

@mostaphaRoudsari
Contributor

Data Reduction

I cleaned the Manhattan file (saved as Manhattan_min): no spaces, no Z, and only three decimal places. I kept the areas as they will be useful for energy simulation. Let me know if it works and I will update the rest.

I'll check for the rest as soon as I get a chance. Thank you for the great work Theo. :)

@theo-armour
Member Author

Mostapha

File Sizes

The 25% file size reduction in Manhattan is great. Ditto the others. Thank you!

Unfortunately Brooklyn is still 55 MB. Virtually impossible to deal with at the moment.

Before too long I plan to create some sort of smaller CSV index file that just has the street address, the first X and Y and the start byte of the position in the large file.
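
A minimal Python sketch of such an index follows. It assumes one record per line, with the street address in the first column and the first X and Y in the next two. The column positions and the names `build_index` and `read_record` are hypothetical, and the plain `split(",")` would break on quoted fields that contain commas.

```python
import io


def build_index(stream, address_col=0, x_col=1, y_col=2):
    """Return (address, first_x, first_y, start_byte) for each record line.

    stream must be opened in binary mode so tell() gives true byte offsets.
    """
    index = []
    while True:
        offset = stream.tell()
        line = stream.readline()
        if not line:
            break
        fields = line.decode("utf-8").rstrip("\r\n").split(",")
        if len(fields) <= max(address_col, x_col, y_col):
            continue  # skip blank or short lines
        index.append((fields[address_col], fields[x_col], fields[y_col], offset))
    return index


def read_record(stream, offset):
    """Jump straight to one record in the large file using its start byte."""
    stream.seek(offset)
    return stream.readline().decode("utf-8").rstrip("\r\n")


# Usage with an in-memory stand-in for a borough file (made-up rows):
demo = io.BytesIO(b"addr A,1.0,2.0,3.0,4.0\naddr B,5.0,6.0\n")
demo_index = build_index(demo)
```

In a browser the same start byte could presumably drive an HTTP range request, so only the wanted record is downloaded rather than the whole 55 MB file.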

So you can leave the boroughs other than Manhattan and, say, the Bronx aside for the moment.

Repetition of first point as the last point

These repetitions are not really needed. They can easily be dealt with programmatically.
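
For example, a minimal Python sketch, assuming each polygon is held as a list of (x, y) tuples (the function name `drop_closing_point` is hypothetical):

```python
def drop_closing_point(points):
    """Return points without the final vertex when it repeats the first."""
    if len(points) > 1 and points[0] == points[-1]:
        return points[:-1]
    return points
```

The viewer can then close each outline itself when drawing, so nothing is lost by leaving the last point out of the file.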

Issues with the data << the most important thing

Please open the min Manhattan CSV.

Look at line 5, 'Pier 6', position 14. Note the value is 24346981182,
Now look at line 6, 'Peter Minuet Plaza'. In the same position you will see a much smaller number: 0980517.

You will see similar differences throughout the file. I believe that there should be a comma before the '98' (or the equivalent) in these positions throughout.

This fusing of two values into one is probably losing us a lot of data, so it would be really useful if you could take a look at this.
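
To get a feel for how widespread this is, a quick diagnostic along these lines might help. It is only a sketch: the function name `find_fused_fields` is hypothetical, and the seven-digit threshold is a guess based on the two values quoted above, not a known property of the data.

```python
def find_fused_fields(row, max_digits=7):
    """Return (index, value) for digit-only fields longer than max_digits.

    Indexes are 0-based positions in the comma-separated row.
    """
    suspects = []
    for i, field in enumerate(row.split(",")):
        value = field.strip()
        if value.isdigit() and len(value) > max_digits:
            suspects.append((i, value))
    return suspects
```

Running this over every line and counting the hits would show whether the missing comma always falls at the same position.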

Again thanks for the updates.

Theo
