Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: handle non-local files with grace #2135

Open
1 of 5 tasks
SgtPooki opened this issue Jul 6, 2023 · 9 comments
Open
1 of 5 tasks

feat: handle non-local files with grace #2135

SgtPooki opened this issue Jul 6, 2023 · 9 comments
Labels
area/screen/files Issues related to Files screen effort/days Estimated to take multiple days, but less than a week exp/intermediate Prior experience is likely helpful kind/enhancement A net-new feature or improvement to an existing feature P2 Medium: Good to have, but can wait until someone steps up status/ready Ready to be worked

Comments

@SgtPooki
Copy link
Member

SgtPooki commented Jul 6, 2023

This task aims to resolve all of the issues in the Related items section below. The tasklist clarifies how we plan to accomplish this.

Tasks

  1. P3 area/screen/files effort/days kind/enhancement status/ready
  2. P3 area/screen/files effort/days exp/intermediate help wanted kind/enhancement status/ready

Some details about a method forward for this at ipfs/ipfs-desktop#2492 (comment)

Related items

@SgtPooki SgtPooki added the need/triage Needs initial labeling and prioritization label Jul 6, 2023
@SgtPooki SgtPooki self-assigned this Jul 6, 2023
@SgtPooki SgtPooki added exp/intermediate Prior experience is likely helpful P2 Medium: Good to have, but can wait until someone steps up status/ready Ready to be worked area/screen/files Issues related to Files screen kind/enhancement A net-new feature or improvement to an existing feature effort/days Estimated to take multiple days, but less than a week and removed need/triage Needs initial labeling and prioritization labels Jul 6, 2023
@SgtPooki SgtPooki removed their assignment Jul 6, 2023
@github-project-automation github-project-automation bot moved this to Needs Grooming in IPFS-GUI (PL EngRes) Jul 7, 2023
@SgtPooki SgtPooki moved this from Needs Grooming to Prioritized / Ready for Dev in IPFS-GUI (PL EngRes) Jul 10, 2023
@kt-wawro kt-wawro moved this to In progress in PLDG Cohort 0 Project Board Sep 3, 2024
@acul71
Copy link
Contributor

acul71 commented Sep 4, 2024

@SgtPooki @kt-wawro
Planning to work on this after closing #2255

@acul71
Copy link
Contributor

acul71 commented Sep 25, 2024

@SgtPooki @lidel
About:

  • feat(files): Users can now see whether files are local or remote before clicking them.

To determine if a file is local, I'm using the following commands:

ipfs block stat --offline CID   # ipfs.block.stat(cid, { offline: true })

I’m retrieving all blocks for the current directory and checking if they are local or not by using:

ipfs dag get CID   # ipfs.dag.get(f.cid)

I modified the dirStats function in actions.js. Initially, I retrieved all the links for the listed files and checked if they were local. However, this turned out to be time-consuming. So, empirically, I now check only the first link and assume:

  • If one link CID is found to be local, the file is marked as local.
  • If one link CID is offline, the file is marked as remote.
  • If the directory list exceeds 300 entries, I stop checking and mark it as undefined.
/**
 * @param {IPFSService} ipfs
 * @param {CID} cid
 * @param {Object} options
 * @param {string} options.path
 * @param {boolean} [options.isRoot]
 * @param {import('./utils').Sorting} options.sorting
 */
const dirStats = async (ipfs, cid, { path, isRoot, sorting }) => {
  const entries = await all(ipfs.ls(cid)) || []
  // Workarounds regression in IPFS HTTP Client that causes
  // ls on empty dir to return list with that dir only.
  // @see https://github.com/ipfs/js-ipfs/issues/3566
  const res = (entries.length === 1 && entries[0].cid.toString() === cid.toString())
    ? []
    : entries
  const files = []

  // precaution: there was a historical performance issue when too many dirs were present
  let dirCount = 0

  for (const f of res) {
    const absPath = join(path, f.name)
    let file = null

    if (dirCount < 1000 && (f.type === 'directory' || f.type === 'dir')) {
      dirCount += 1
      file = fileFromStats({ ...await stat(ipfs, f.cid), path: absPath })
    } else {
      file = fileFromStats({ ...f, path: absPath })
    }

    // Check if file is local
    const refs = await ipfs.dag.get(f.cid)
    // Extracting all CIDs
    const refsCids = refs.value.Links.map(link => link.Hash['/'])
    // console.log(refsCids)
    // console.log(`refs=${JSON.stringify(refs, null, 2)}`)
    if (dirCount > 300) {
      file.isLocal = undefined
    } else
    if (refsCids.length === 0) {
      file.isLocal = true
    } else {
      file.isLocal = true
      // for (const myCid of refsCids) {
      const myCid = refsCids[0]
      const cid = CID.decode(myCid)
      console.log(cid)
      try {
        // console.log(`myCid=${myCid} cid=${cid.toString()}`)
        // const res = await ipfs.block.stat(cid, { withLocal: true, offline: true })
        const res = await ipfs.block.stat(cid, { offline: true })
        console.log(`myCid=${cid}, res=${JSON.stringify(res, null, 2)}`)
        // break
      } catch (error) {
        file.isLocal = false
        console.error(`Error fetching stats for cid=${cid}:`, error)
        // break
      }
    }
    files.push(file)
  }

The UI looks like this:

  • No icon for local files.
  • A cloud symbol for remote files.
  • A question mark symbol for files with an unknown state.
    { isLocal === undefined ? '❔' : !isLocal && '☁️' }

image

PS: Try the Project Apollo Archives and XKCD archives
Thoughts?

@acul71
Copy link
Contributor

acul71 commented Sep 25, 2024

@SgtPooki @lidel
About

  • feat(files): when offline, files/folders are not clickable if the user would not be able to view the content/children

I wanted to check whether the browser is offline or online in JavaScript by using the navigator.onLine property, which returns a boolean value
How ever not all browser are compliant like safari on mac.
So I after browsing for info on forums, I thought to add an additional check, let's say once every 60 secs getting google icon to see if we're really on line.

const [isOnline, setIsOnline] = useState(navigator.onLine)

  const CHECK_INTERNET_STATUS = 60 * 1000

  function isGoogleOnline () {
    return new Promise((resolve) => {
      const img = document.createElement('img')
      img.onerror = () => resolve(false)
      img.onload = () => resolve(true)
      img.src = 'https://www.google.com/favicon.ico?_=' + new Date().getTime()
    })
  }

  // Function to create a delay
  const delay = (ms) => new Promise(resolve => setTimeout(resolve, ms))

  // Function to check status and log the result
  async function checkInternetStatus () {
    let onlineStatus = navigator.onLine
    if (onlineStatus) {
      console.log('You seems online! Checking Google...')
      // Wait for 3 seconds
      await delay(3000)
      const googleOnline = await isGoogleOnline()
      console.log('Google is', googleOnline ? 'online' : 'offline')
      onlineStatus = googleOnline
    } else {
      console.log('No internet connection.')
    }
    setIsOnline(onlineStatus)
  }

  // Use useEffect to run the check once on component mount and set up an interval
  useEffect(() => {
    // Event listeners for online and offline status
    const handleOnline = () => checkInternetStatus()
    const handleOffline = () => setIsOnline(false)

    window.addEventListener('online', handleOnline)
    window.addEventListener('offline', handleOffline)

    // Check status immediately on page load
    checkInternetStatus()

    // Set up the interval to check status every 15 seconds
    const intervalId = setInterval(checkInternetStatus, CHECK_INTERNET_STATUS)

    // Clean up the interval and event listeners when the component unmounts
    return () => {
      clearInterval(intervalId)
      window.removeEventListener('online', handleOnline)
      window.removeEventListener('offline', handleOffline)
    }
  }, []) // Empty dependency array to run only once on mount

If we're offline I'll apply this class to shade/unclick the file view

.greyed-out {
  color: grey;
  pointer-events: none; /* This will prevent clicking */
  opacity: 0.5;
}

Should I choose only to use navigator.onLine property or use also the get image workaround, but I'm afraid in some cases it could fail even if we're not offline (could it happen a captcha ?) and shade the file content

Thoughts?

@SgtPooki
Copy link
Member Author

hey @acul71, thanks for the ping. We shouldn't include a dependency on a service like google.. I think a simple check for a 'hello world' CID should be sufficient for a backup online check? such as a fetch for bafkqae2xmvwgg33nmuqhi3zajfiemuzahiwss or similar.

If we query the local kubo node it could resolve immediately (due to being an inline CID) even if we're not actually online, so i'm thinking maybe a light fetch request to trustless-gateway.link?

image
const url = 'https://trustless-gateway.link/ipfs/bafkqae2xmvwgg33nmuqhi3zajfiemuzahiwss?format=raw';

async function headRequest(url) {
  try {
    const response = await fetch(url, {
      method: 'HEAD',
    });

    if (!response.ok) {
      throw new Error(`HTTP error! status: ${response.status}`);
    }

    // Log status code
    console.log('Status Code:', response.status);
  } catch (error) {
    console.error('Error making HEAD request:', error);
  }
}

headRequest(url);

@acul71
Copy link
Contributor

acul71 commented Sep 30, 2024

@SgtPooki @lidel
About

  • feat(files): users can replicate files locally with an action

I'm planning to modify doFilesFetch that handles the procedure.
After the command const cid = await ipfs.files.cp(srcPath, dst) I'm doing ipfs.refs(srcPath, { recursive: true }) to actually get all the blocks. Is this the way or ?
I saw that for large content It can take a long time to complete and no visual status (except console messages for the moment) is given

/**
  * Adds file under the `src` path to the given `root` path. On completion will
  * trigger `doFilesFetch` to update the state.
  * @param {string} root
  * @param {string} src
  * @param {string} name
  */
 doFilesAddPath: (root, src, name = '', downloadLocally = false) => perform(ACTIONS.ADD_BY_PATH, async (ipfs, { store }) => {
   ensureMFS(store)

   const path = realMfsPath(src)
   const cid = /** @type {string} */(path.split('/').pop())

   if (!name) {
     name = cid
   }

   const dst = realMfsPath(join(root, name))
   const srcPath = src.startsWith('/') ? src : `/ipfs/${cid}`

   try {
     console.log(`Adding ipfs.files.cp(${srcPath}, ${dst}) downloadLocally=${downloadLocally}`)
     const cid = await ipfs.files.cp(srcPath, dst)
     console.log(`Adding ipfs.files.cp(${srcPath}, ${dst}) = ${cid}`)

     // ipfs refs --recursive --help
     if (downloadLocally === true) {
       for await (const ref of ipfs.refs(srcPath, { recursive: true })) {
         if (ref.err) {
           console.error('refs: ', ref.err)
         } else {
           console.log('refs: ', ref)
           // output: "QmHash"
         }
       }
     }
     return cid
   } finally {
     await store.doFilesFetch()
   }
 }),

@acul71
Copy link
Contributor

acul71 commented Oct 3, 2024

@SgtPooki @lidel About:

  • feat(files): Users can now see whether files are local or remote before clicking them.

To determine if a file is local, I'm using the following commands:

ipfs block stat --offline CID   # ipfs.block.stat(cid, { offline: true })

I’m retrieving all blocks for the current directory and checking if they are local or not by using:

ipfs dag get CID   # ipfs.dag.get(f.cid)

I modified the dirStats function in actions.js. Initially, I retrieved all the links for the listed files and checked if they were local. However, this turned out to be time-consuming. So, empirically, I now check only the first link and assume:

  • If one link CID is found to be local, the file is marked as local.
  • If one link CID is offline, the file is marked as remote.
  • If the directory list exceeds 300 entries, I stop checking and mark it as undefined.

..........

lidel (from Slack)
Monday at 4:58 PM
i think heuristic and proposed UX sounds sensible, you may run into trouble with big HAMT-sharded directories but people dont open such big directories with webui anyway.
something you may explore is checking directory in a streaming fashion, or maybe even improving kubo RPC.
what you can do today in CLI (i dont remember if kubo-rpc-client in JS supports this in streaming fashion):
ipfs ls --stream --size=false --resolve-type=false --offline /ipns/en.wikipedia-on-ipfs.org/wiki/ will list content of directory without fetching root blocks of children (and you will get Error if you dont have enough blocks to enumerate directory
ipfs ls --stream --size=true --resolve-type=true --offline /ipns/en.wikipedia-on-ipfs.org/wiki/ will also check size and type of every child, triggering its fetch, so if any of children is not local, you will get error like
Error: block was not found locally (offline): ipld: could not find bafybeihn2f7lhumh4grizksi2fl233cyszqadkn424ptjajfenykpsaiw4
What we could add to Kubo RPC, is --is-local=true|false which would return additional field for every item in directory informing you if the root is local or not.
This way we could do ipfs ls --stream --size=false --resolve-type=false --is-local=true /ipns/en.wikipedia-on-ipfs.org/wiki/ (using new flag instead of --offline and fetching all roots ) and this way it would not error on first missing block, but enumerate directory without doing extra work.

React

4:59
anyway, maybe open a PR or issue and we can discuss there? feels like useful feature, but we also should not add extra 1-2 requests per file
5:00
in my mind, locality check is a niche feature, most of users will have data locally, no need for them to pay penalty of those additional checks , so better to add locality status to RPC, just like we get file size i guess?

##############################

@lidel @SgtPooki
I've been trying to profile the added code I find out that this is the time added by the additional check:

time_X                  43 files    0.152 seconds
time_Apollo_albums.txt  105 files   0.422 seconds
time_XKCD_Archive.txt   ~1800 files  1.429 seconds

Maybe what I can do is submit this solution in the PR and open an issue or feat request in kubo, to let this info (local or not) come back with ipfs.files.stat.
It seems to me reasonable to wait some fraction of a second in average to get the info. And then when the kubo feat will be implemented, change the code accordingly.
Let me know, what to do..

/**
 * @param {IPFSService} ipfs
 * @param {CID} cid
 * @param {Object} options
 * @param {string} options.path
 * @param {boolean} [options.isRoot]
 * @param {import('./utils').Sorting} options.sorting
 */
const dirStats = async (ipfs, cid, { path, isRoot, sorting }) => {
  const entries = await all(ipfs.ls(cid)) || []
  // Workarounds regression in IPFS HTTP Client that causes
  // ls on empty dir to return list with that dir only.
  // @see https://github.com/ipfs/js-ipfs/issues/3566
  const res = (entries.length === 1 && entries[0].cid.toString() === cid.toString())
    ? []
    : entries
  const files = []

  // precaution: there was a historical performance issue when too many dirs were present
  let dirCount = 0
  // let fileCount = 0

  let totDuration = 0
  for (const f of res) {
    const absPath = join(path, f.name)
    let file = null

    if (dirCount < 1000 && (f.type === 'directory' || f.type === 'dir')) {
      dirCount += 1
      file = fileFromStats({ ...await stat(ipfs, f.cid), path: absPath })
    } else {
      file = fileFromStats({ ...f, path: absPath })
    }

    // Start timing
    const startTime = performance.now()
    // If dirCount > 300 stop checking and mark isLocal as undefined
    if (dirCount > 300) {
    // if (fileCount > 300) {
      file.isLocal = undefined
    } else {
      // fileCount += 1
      // Check if file is local
      // TODO: See if # ipfs ls --size=false --resolve-type=false CID is faster then ipfs.dag.get CID
      const refs = await ipfs.dag.get(f.cid)
      // Extracting all CIDs
      // const refsCids = refs.value.Links.map(link => link.Hash['/'])
      const firstRefCid = refs.value.Links.length > 0 ? refs.value.Links[0].Hash['/'] : ''

      // console.log(refsCids)
      // console.log(`refs=${JSON.stringify(refs, null, 2)}`)
      // if (refsCids.length === 0) {
      if (firstRefCid === '') {
        file.isLocal = true
      } else {
        file.isLocal = true
        // for (const myCid of refsCids) {
        // const myCid = refsCids[0]
        // const cid = CID.decode(myCid)
        const cid = CID.decode(firstRefCid)
        // console.log(cid) // debug lucas
        try {
          // console.log(`myCid=${myCid} cid=${cid.toString()}`)
          // const res = await ipfs.block.stat(cid, { offline: true })
          await ipfs.block.stat(cid, { offline: true })
          // console.log(`myCid=${cid}, res=${JSON.stringify(res, null, 2)}`)
          // break
        } catch (error) {
          file.isLocal = false
          console.error(`Error fetching stats for cid=${cid}:`, error)
          // break
        }
      }
    }
    // End timing and log the duration
    const endTime = performance.now()
    // const duration = (endTime - startTime) / 1000 // Convert to seconds
    const duration = (endTime - startTime)
    totDuration += duration

    files.push(file)
  }
  console.log(`Time taken to check if file is local: ${totDuration / 1000} secs`)

  let parent = null

  if (!isRoot) {
    const parentPath = dirname(path)
    const parentInfo = infoFromPath(parentPath, false)

    if (parentInfo && (parentInfo.isMfs || !parentInfo.isRoot)) {
      const realPath = parentInfo.realPath

      if (realPath && realPath.startsWith('/ipns')) {
        parentInfo.realPath = await last(ipfs.name.resolve(parentInfo.realPath))
      }

      parent = fileFromStats({
        ...await stat(ipfs, parentInfo.realPath),
        path: parentInfo.path,
        name: '..',
        isParent: true
      })
    }
  }

  return {
    path,
    fetched: Date.now(),
    type: 'directory',
    cid,
    upper: parent,
    content: sortFiles(files, sorting)
  }
}

@acul71
Copy link
Contributor

acul71 commented Oct 4, 2024

@lidel @SgtPooki
About:

image

What is the exact meaning of files, pins, blocks, local repo ?

Right know It's like this:
image

I assume that:

  • local repo is the same of ALL BLOCKS ("Total size of blocks on your entire IPFS node; this includes everything in Files, plus all locally pinned items and any temporary cached data")
  • files is the same of FILES ("Total size of data in the current directory (if a subdirectory, the size of all data in Files is also displayed)")
  • pins: What is exactly (The number of pinned CIDs?)
  • blocks: What is exactly (The number of all blocks from the root ?)

About the expandable thingy (▶) About this numbers:
Right now there is a tool-tip when you do a mouse over the label:
image
Should I have to keep this approach (info on mouse over) and/or add the expandable thingy (▶) ?

@SgtPooki
Copy link
Member Author

SgtPooki commented Nov 14, 2024

What is the exact meaning of files, pins, blocks, local repo ?

Name Old name Meaning
files all blocks files The size of all the files including local and remote
pins n/a The number of items pinned
blocks n/a The number of total blocks in the MFS store
local repo files all blocks The size of all blocks actually stored in your kubo blockstore

Should I have to keep this approach (info on mouse over) and/or add the expandable thingy (▶) ?

I believe we want to add the expandable section so we can provide more information. It may not look great when expanded though, so we might need to pivot. However, let's start there

@lidel
Copy link
Member

lidel commented Nov 14, 2024

What is the exact meaning of files, pins, blocks, local repo ?

@acul71 the issue #1248 you linked is from 2019 and is how old UI looked like. It was simplified since then (removed noise like pins and blocks).

I wrote down my understanding of the current state of that header (and the sources of values) in #1248 (comment).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/screen/files Issues related to Files screen effort/days Estimated to take multiple days, but less than a week exp/intermediate Prior experience is likely helpful kind/enhancement A net-new feature or improvement to an existing feature P2 Medium: Good to have, but can wait until someone steps up status/ready Ready to be worked
Projects
No open projects
Status: Prioritized / Ready for Dev
Development

No branches or pull requests

3 participants