Download only content between a date range #939

Closed
mrsilver76 opened this issue Aug 23, 2024 · 7 comments

Comments

@mrsilver76

mrsilver76 commented Aug 23, 2024

Summary

It would be nice if we could define a date range so that only content taken on or between those dates is downloaded.

One way would be to ask the user to provide "to" and "from" dates - eg. --date-range 2023-01-01-2023-12-31 would download anything taken between 1st Jan 2023 and 31st Dec 2023 (in other words, anything in 2023).

Another way would be to have two separate arguments that limit downloads, e.g. --download-after 2023-01-01 and --download-before 2023-12-31. Each can be used on its own, but used together they define a date range.
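
To make the second option concrete, here is a minimal sketch of how such a filter might behave. The flag names --download-after and --download-before are only the ones proposed above, not existing icloudpd options, and the helper names parse_day, build_parser and in_range are hypothetical.

```python
import argparse
from datetime import date, datetime
from typing import Optional


def parse_day(value: str) -> date:
    """Parse a YYYY-MM-DD string into a date (used as an argparse type)."""
    return datetime.strptime(value, "%Y-%m-%d").date()


def build_parser() -> argparse.ArgumentParser:
    # Both flags are hypothetical; they only mirror the proposal in this issue.
    parser = argparse.ArgumentParser()
    parser.add_argument("--download-after", type=parse_day, default=None)
    parser.add_argument("--download-before", type=parse_day, default=None)
    return parser


def in_range(taken: date, after: Optional[date], before: Optional[date]) -> bool:
    """Return True if `taken` is on or between the bounds (None = no bound)."""
    if after is not None and taken < after:
        return False
    if before is not None and taken > before:
        return False
    return True


args = build_parser().parse_args(
    ["--download-after", "2023-01-01", "--download-before", "2023-12-31"]
)
print(in_range(date(2023, 6, 15), args.download_after, args.download_before))  # True
print(in_range(date(2024, 1, 1), args.download_after, args.download_before))  # False
```

Because a missing bound is treated as open-ended, either flag would also work on its own, which is the behaviour described above.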

Context

I would like to be able to download my content in batches based on the date the content was taken. To date, my only options are to download either the most recent X items or download the whole lot.

@mrsilver76
Author

Looks like this feature request is covered by #779 so I will close this request.

@AndreyNikiforov
Collaborator

I would appreciate your describing the need to download in batches further.

What do you do with the images in a batch after downloading? Do you delete them in iCloud, or process them locally and delete them locally? Do batches help limit the amount of work you are doing with images, or help you not to exceed the limit of your local storage? -- just some questions to poke into your need.

@mrsilver76
Author

Not a problem :)

In my case, I've run out of iCloud storage and (because I'm tight) I'm in the process of moving approximately 50,000 photos from iCloud to our NAS. At the same time, I'm attempting to group photos that both my partner and I have taken with our iPhones at the same place/time so that we have a complete set of memories.

I'd like to batch download for a couple of reasons:

  1. icloud.com is limited to 1000 per download, so I don't want to be doing 50 of these. In addition, actually trying to select 1000 items is much harder than it seems because of the UI.
  2. A 200GB download will take a very long time to run using icloudpd. I'd have to set it up overnight and if it fails then I'll be asleep.
  3. If I did download the 200GB, I'd have to store them on the NAS. Whilst that isn't too much of a problem, I'd get a better experience sorting, deleting, viewing and editing files on the laptop SSD than I will on the NAS - unfortunately my laptop SSD doesn't have enough storage space!
  4. Even if I did have a large enough SSD, realistically, there isn't much value in me having the whole 200GB in one fell swoop. My plan was to attack the photos one year at a time and, given that, it'll probably take me a good six months to work my way through them all.

Does that help at all? More than happy to provide further details/insight if needed!

(side note: I've actually written a small C# program, which I plan to put on GitHub, that takes the photos and attempts to logically group them into folders based on the date they were taken and their location - so this is hopefully going to reduce my workload)

@AndreyNikiforov
Collaborator

I see two problems that you are trying to solve:

  1. reduce storage used in iCloud.com; icloudpd has --delete-after-download to keep iCloud.com clean, and there are requests to make that functionality a bit more flexible/configurable
  2. organize assets in local storage; icloudpd supports basic organization by asset date with the --folder-structure parameter. I assume that hierarchy mostly helps with the speed of file access by other software, but I never tested that assumption. There are a lot of opportunities in organizing photos (face recognition, location, etc.) that other software handles well, and I don't see icloudpd tackling them in the near future (if ever). AFAIK most photo organization software does such organization "virtually": it presents your collection in a new way without moving or touching files in your storage. If using photo organization software is where you see yourself ending up, you may not need to invest in reorganizing photo files now... SSD vs NAS will also depend heavily on the photo organization software's behavior, e.g. it may, hypothetically, run a backend on the NAS and use an SSD cache in a frontend on the laptop
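
As an illustration of date-based organization, here is a minimal sketch that assumes the --folder-structure value is a Python format string applied to the asset's creation date, with "none" meaning no date folders (as the download_photo snippet later in this thread suggests). The function name date_path_for and the default pattern shown are assumptions for illustration, not a statement of icloudpd's actual implementation.

```python
from datetime import datetime


def date_path_for(created: datetime, folder_structure: str = "{:%Y/%m/%d}") -> str:
    """Build a relative folder path from a photo's creation date.

    Assumption: the pattern is a str.format template whose format spec is
    handed to the datetime, so strftime codes such as %Y or %m work.
    """
    if folder_structure.lower() == "none":
        # No date-based hierarchy: files go straight into the target folder.
        return ""
    return folder_structure.format(created)


print(date_path_for(datetime(2008, 5, 17)))           # 2008/05/17
print(date_path_for(datetime(2008, 5, 17), "{:%Y}"))  # 2008
print(repr(date_path_for(datetime(2008, 5, 17), "none")))  # ''
```

A year-only pattern like the second call would put each year's photos in one folder, which fits a plan of working through a library one year at a time.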

Please keep us posted about your decisions and progress -- it will help others: a) directly, when they follow your path, or b) indirectly, as icloudpd's future functionality takes shape

@mrsilver76
Author

That's a good description of the problems. For number 2, I've already got that covered with various other tools.

For number 1, I've decided that I really don't want to be downloading 200GB of images in one go. So, although I've never written anything in Python, I'm going to see if I can use the pull request written by itwasabhi to hardcode in a year limiter.

@mrsilver76
Author

I think the following hack will work in download_photo in base.py:

            try:
                created_date = photo.created.astimezone(get_localzone())
            except (ValueError, OSError):
                logger.error(
                    "Could not convert photo created date to local timezone (%s)", photo.created
                )
                created_date = photo.created

            # Hardcoded hack to see if photo was taken in 2008
            if created_date.year != 2008:
                logger.debug("Skipping %s, as not taken in 2008", photo.filename)
                return False
            # End of hardcoded hack

            if folder_structure.lower() == "none":
                date_path = ""

What do you think? I probably should work out how to define a constant in Python :)
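
On the constant: Python has no enforced constants, but by convention a module-level name written in capitals is treated as one. A minimal sketch of the same year check with the year pulled out follows; TARGET_YEAR and taken_in_target_year are hypothetical names, not part of icloudpd.

```python
from datetime import datetime, timezone

# By convention, an ALL_CAPS module-level name is treated as a constant.
# Python does not enforce this; it is only a signal to readers.
TARGET_YEAR = 2008


def taken_in_target_year(created: datetime, year: int = TARGET_YEAR) -> bool:
    """Return True if the photo's creation date falls in the given year."""
    return created.year == year


print(taken_in_target_year(datetime(2008, 7, 4, tzinfo=timezone.utc)))  # True
print(taken_in_target_year(datetime(2009, 1, 1, tzinfo=timezone.utc)))  # False
```

With a helper like this, the condition in the hack would read `if not taken_in_target_year(created_date):`, and changing the batch means editing a single line.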

@AndreyNikiforov
Collaborator

Looks reasonable to me. As long as you are not using the --delete-after-download flag, the only thing at risk if something goes wrong is the internet traffic spent on the photos from 2008. If your traffic is unlimited, then the only cost is wall-clock time...
