Basic Format Conventions
Each datafolder should contain a settings.txt file with basic metadata about the data, such as the spatial extent (boundary) or the total quantity of data.
Each setting consists of a name, followed by a colon and a single space, and then the value:
profilename: Area01
A settings file could look like this:
profilename: AC_S2E1
lat: 28.8667226311676
lng: -3.66452177963898
bottomlat: 18.0675865620531
leftlong: -13.7887118444338
toplat: 39.6658587002821
rightlong: 6.45966828515584
querytype: upload_time
minuploaddate: 1/1/2007
maxuploaddate: 1/1/2015
geoquery: bbox
bboxwidth: 2400
bboxlength: 1800
tags:
accuracy: 16
safesearch: 3
contenttype: 1
perpage: 250
sort: date-taken-desc
maxperfile: 50000
queryname: AC_S2E1
subgrid: True
tilesize: 300
subgridStart: 1
subgridEnd: 48
querysnooze: True
queryWaitTime: 15
queryAPICallWait: 1500
ClipGeoVersion: 0.9.1.14089
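The "name, colon, single space, value" format can be read with a few lines of Python. The following is a minimal sketch, not the tool's own parser; the function name `parse_settings` is hypothetical, and it assumes that all values are kept as strings and that lines without a colon are skipped.

```python
def parse_settings(text):
    """Parse 'name: value' lines into a dict; all values stay strings."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        # Split on the first colon only, so values may contain colons.
        name, sep, value = line.partition(":")
        if not sep:
            continue  # assumption: lines without a colon are not settings
        settings[name.strip()] = value.strip()
    return settings


example = "profilename: AC_S2E1\nbottomlat: 18.0675865620531\ntags:\nmaxperfile: 50000\n"
settings = parse_settings(example)
print(settings["profilename"], int(settings["maxperfile"]))
```

Settings with an empty value, such as `tags:` in the example above, end up as empty strings.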
Currently, only the following settings are mandatory:
- profilename (e.g. 'Area01')
- bottomlat (e.g. '18.067')
- leftlong (e.g. '-13.788')
- toplat (e.g. '39.665')
- rightlong (e.g. '6.459')
- querytype (either 'upload_time' or 'date_taken'), together with the matching date range:
  - minuploaddate (e.g. '1/1/2007') and maxuploaddate (e.g. '1/1/2015'), or
  - mintakendate (e.g. '1/1/2007') and maxtakendate (e.g. '1/1/2015')
- maxperfile (e.g. '50000', max number of entries per datafile)
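A check for the mandatory settings listed above could be sketched as follows. The names `BASE_KEYS`, `DATE_KEYS` and `check_settings` are hypothetical, and the pairing of 'upload_time' with the upload dates and 'date_taken' with the taken dates is an assumption based on the setting names, not something the tool is confirmed to enforce.

```python
# Settings that are always required.
BASE_KEYS = ("profilename", "bottomlat", "leftlong", "toplat",
             "rightlong", "querytype", "maxperfile")

# Assumption: each querytype requires its own pair of date settings.
DATE_KEYS = {
    "upload_time": ("minuploaddate", "maxuploaddate"),
    "date_taken": ("mintakendate", "maxtakendate"),
}


def check_settings(settings):
    """Return a list of problems found in a settings dict; empty means OK."""
    problems = [f"missing: {key}" for key in BASE_KEYS if not settings.get(key)]
    querytype = settings.get("querytype")
    if querytype in DATE_KEYS:
        problems += [f"missing: {key}" for key in DATE_KEYS[querytype]
                     if not settings.get(key)]
    elif querytype:
        problems.append(f"unknown querytype: {querytype}")
    return problems


print(check_settings({"profilename": "Area01", "querytype": "upload_time"}))
```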
- profilename is used as the display name of each data extent on the map.
- bottomlat, leftlong, toplat and rightlong describe the spatial boundary of all points in all datafiles of a folder. The tool uses this information to exclude extents (i.e. datafolders) that do not intersect with the selection area.
- querytype, minuploaddate, maxuploaddate, mintakendate and maxtakendate describe the timespan of the data.
- maxperfile is the maximum number of entries (= lines) in each datafile. If this number is present in the filename of a datafile, the tool builds an estimate of the total available data in the database.
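Excluding datafolders whose extent does not intersect the selection area amounts to a bounding-box overlap test. The sketch below illustrates the idea only; the function name `extent_intersects` is hypothetical, and it assumes that the selection area is itself a rectangle and that neither box crosses the 180° meridian, which may not match what the tool actually does.

```python
def extent_intersects(extent, selection):
    """Return True if two boxes overlap (touching edges count as overlap).

    Both arguments are (bottomlat, leftlong, toplat, rightlong) tuples
    in decimal degrees.
    """
    e_bottom, e_left, e_top, e_right = extent
    s_bottom, s_left, s_top, s_right = selection
    return (e_left <= s_right and s_left <= e_right
            and e_bottom <= s_top and s_bottom <= e_top)


# Extent taken from the example settings file above.
extent = (18.0675865620531, -13.7887118444338, 39.6658587002821, 6.45966828515584)
print(extent_intersects(extent, (30.0, -5.0, 35.0, 0.0)))
print(extent_intersects(extent, (50.0, 10.0, 55.0, 20.0)))
```

The first selection lies inside the extent and the second lies entirely to the north-east of it, so only the first folder would need to be read.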
- File name conventions:
  - if you want to include the total number of entries per file in the overall statistics, either use the 'maxperfile' value as part of your datafile name (e.g. mydatafile_50000_AreaXYZ.txt) or separate the name parts by '_' and use the last but one part to provide a data count (e.g. mydatafile_4679_AreaXYZ.txt)
  - given the above examples, the system is actually a bit redundant, since maxperfile isn't really necessary anymore
  - don't use names for datafiles that contain 'settings' or start with 'log'; files named 'GridCoordinates.txt' will be ignored
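The file name conventions above could be applied as in the following sketch. The helper names `count_from_filename` and `is_datafile` are hypothetical, and comparing names case-insensitively is an assumption rather than documented behavior.

```python
import os


def count_from_filename(filename):
    """Return the entry count from the last-but-one '_' part, or None."""
    stem, _ext = os.path.splitext(os.path.basename(filename))
    parts = stem.split("_")
    if len(parts) < 2:
        return None
    candidate = parts[-2]
    return int(candidate) if candidate.isdecimal() else None


def is_datafile(filename):
    """Return False for settings files, log files and GridCoordinates.txt."""
    name = os.path.basename(filename).lower()  # assumption: case-insensitive
    if "settings" in name or name.startswith("log"):
        return False
    return name != "gridcoordinates.txt"


for name in ("mydatafile_4679_AreaXYZ.txt", "settings.txt", "log_run1.txt"):
    print(name, is_datafile(name), count_from_filename(name))
```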
- Content conventions:
  - UTF-8 encoded
  - comma-separated
  - within entries, all commas are replaced by semicolons
  - the first line should contain the following headers: ID,Latitude,Longitude,NAME,URL,PhotoID,Owner,UserID,DateTaken,UploadDate,Views,Tags,MTags
  - this list is currently optimized for Flickr data
  - each of the following lines contains information on a single coordinate:
5,30.689483,-6.262623,Title of the picture,https://linkto.com/onlineimage.jpg,3385943542,Photographername,UserID,2/17/2009 15:53:14,3/25/2009 20:35:15,132,;africa;stone;mountains;nature;rock;valley;landscape;,geo:lat=3068948200 geo:lon=626262200
  - ID (e.g. 5) is a continuously increasing number of type Long within each datafile
  - Latitude & Longitude are the coordinates in decimal degrees (WGS 1984)
  - NAME is a bit misleading; it refers to the title of the image
  - URL is a direct link to an online image, which is used to display a preview image when the Photocollection feature is used
  - PhotoID is used to count entries and detect duplicates
  - UserID is used to count photographers
  - DateTaken and UploadDate can be used to clip data between temporal extents; the format is month/day/year hour:minute:second
  - Views is likewise only relevant for statistics
  - Tags contains a list of tags, separated by semicolons
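Because commas inside entries are replaced by semicolons, a data line can be split on commas directly. The following sketch parses one line according to the header list and date format described above; the function name `parse_line` and the keys of the returned dict are hypothetical, and it assumes every line has exactly 13 fields.

```python
from datetime import datetime

HEADERS = ("ID,Latitude,Longitude,NAME,URL,PhotoID,Owner,UserID,"
           "DateTaken,UploadDate,Views,Tags,MTags").split(",")
DATE_FORMAT = "%m/%d/%Y %H:%M:%S"  # month/day/year hour:minute:second


def parse_line(line):
    """Parse one data line into a dict with typed values."""
    values = line.rstrip("\r\n").split(",")
    if len(values) != len(HEADERS):
        raise ValueError(f"expected {len(HEADERS)} fields, got {len(values)}")
    row = dict(zip(HEADERS, values))
    return {
        "id": int(row["ID"]),
        "lat": float(row["Latitude"]),
        "lng": float(row["Longitude"]),
        "title": row["NAME"],
        "url": row["URL"],
        "photo_id": row["PhotoID"],
        "user_id": row["UserID"],
        "date_taken": datetime.strptime(row["DateTaken"], DATE_FORMAT),
        "upload_date": datetime.strptime(row["UploadDate"], DATE_FORMAT),
        "views": int(row["Views"]),
        # Tags are wrapped in and separated by semicolons; drop empty parts.
        "tags": [tag for tag in row["Tags"].split(";") if tag],
    }


line = ("5,30.689483,-6.262623,Title of the picture,"
        "https://linkto.com/onlineimage.jpg,3385943542,Photographername,UserID,"
        "2/17/2009 15:53:14,3/25/2009 20:35:15,132,"
        ";africa;stone;mountains;nature;rock;valley;landscape;,"
        "geo:lat=3068948200 geo:lon=626262200")
entry = parse_line(line)
print(entry["title"], entry["date_taken"].year, entry["tags"])
```

Collecting the `photo_id` and `user_id` values of all parsed entries in sets is one way to count entries, detect duplicates and count photographers.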