Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

While walking, InverseTransform is invoked for directories - is this a bug? #76

Open
RobHoman opened this issue Aug 16, 2022 · 3 comments

Comments

@RobHoman
Copy link

RobHoman commented Aug 16, 2022

Hey @peterbourgon - loving this library and thank you for your stewardship!

Would love to have you take a look at the following lines:

diskv/diskv.go

Lines 597 to 601 in 2566386

key := d.InverseTransform(pathKey)
if info.IsDir() || !strings.HasPrefix(key, prefix) {
return nil // "pass"
}

By my reading, it would be preferable not to call InverseTransform for directories. In my case, I'm using the AdvancedTransform and its inverse in the README, and this is tickling the "panic" line because directories do not carry the expected extension.

What about switching the logic to this? In other words, pass on the directory BEFORE calling InverseTransform.

if info.IsDir() {
    return nil // "pass"
}

key := d.InverseTransform(pathKey)

if !strings.HasPrefix(key, prefix) {
    return nil // "pass"
}
@peterbourgon
Copy link
Owner

My intuition is that the bug is in the example, not the code. WDYT?

@RobHoman
Copy link
Author

RobHoman commented Aug 17, 2022

Perhaps I'm missing something, but as it stands I think I stand by my initial intuition.

I'll attempt to elaborate in hopes that this clarifies my interpretation of functionality.

Here is an example program:

package main

import (
  "fmt"
  "log"
  "path/filepath"
  "strings"

  "github.com/peterbourgon/diskv/v3"
)

func main() {
  const (
    bbdiskvExtension = ".bbdiskv"
  )

  kvs := diskv.New(
    diskv.Options{
      BasePath: "base",
      AdvancedTransform: func(key string) *diskv.PathKey {
        path := strings.Split(key, "/")
        last := len(path) - 1
        retval := &diskv.PathKey{
          Path:     path[:last],
          FileName: path[last] + bbdiskvExtension,
        }
        fmt.Printf("DEBG Converting Key '%s'\n", key)
        fmt.Printf("DEBG Returning PathKey '%#v'\n", retval)
        return retval
      },

      In package fmt m: func(pathKey *diskv.PathKey) (key string) {
        fmt.Printf("DEBG Inversing PathKey %#v\n", pathKey)
        extension := filepath.Ext(pathKey.FileName)
        if extension != bbdiskvExtension {
          fmt.Printf("DEBG - does not have extension: %#v\n", pathKey)
          return ""
        }
        retval := strings.Join(pathKey.Path, "/") + "/" + pathKey.FileName[:len(pathKey.FileName)-len(bbdiskvExtension)]
        fmt.Printf("DEBG Returning string: %#v\n", retval)
        return retval
      },
      CacheSizeMax: 1024 * 1024,
    },
  )

  type writeArgs struct {
    Key   string
    Value string
  }
  writes := []writeArgs{
    {"a/b", "val-for-a/b"},
  }

  fmt.Println("Writing keys: ")
  for _, write := range writes {
    fmt.Printf("- write key '%s' with value '%s'\n", write.Key, write.Value)
    err := kvs.Write(write.Key, []byte(write.Value))
    if err != nil {
      log.Fatalf("failed to write: %v", err)
    }
  }

  fmt.Println("Listing keys: ")
  keyC := kvs.Keys(nil)
  for key := range keyC {
    fmt.Printf("- key '%s'\n", key)
  }
}

Here is the output:

Writing keys:
- write key 'a/b' with value 'val-for-a/b'
DEBG Converting Key 'a/b'
DEBG Returning PathKey '&diskv.PathKey{Path:[]string{"a"}, FileName:"b.bbdiskv", originalKey:""}'
Listing keys:
DEBG Inversing PathKey &diskv.PathKey{Path:[]string{}, FileName:".", originalKey:""}
DEBG - does not have extension: &diskv.PathKey{Path:[]string{}, FileName:".", originalKey:""}
DEBG Inversing PathKey &diskv.PathKey{Path:[]string{}, FileName:"a", originalKey:""}
DEBG - does not have extension: &diskv.PathKey{Path:[]string{}, FileName:"a", originalKey:""}
DEBG Inversing PathKey &diskv.PathKey{Path:[]string{"a"}, FileName:"b.bbdiskv", originalKey:""}
DEBG Returning string: "a/b"
- key 'a/b'

By my understanding, the InverseTransform is being called back to analyze the root directory "." and also the directory a/. But these are not legitimate outputs of the AdvancedTransform function in the first place, and so there is no well defined answer that InverseTransform can give. Furthermore, I don't see why the InverseTransform callback should even be consulted by the walker function if the decision in the following line is to skip the item because info.IsDir() is true.

Another way of looking at it is that it seems throughout the code that there is an intended invariant that directories on the underlying disk never represent an actual key in the store, and I think that invariant implies that the InverseTransform callback should not be asked to ever look at such non-keys.

@peterbourgon
Copy link
Owner

Okay, I see what you mean.

I think the underlying problem here is an error in the original API design, and that design error's interactions with the code in the repo and the example in the README. One aspect/view on that design error is that anything in diskv uses panic to express error conditions, which is 100% wrong, and 100% my mistake. Another way to describe it would be that the various funcs, including InverseTransform, don't return errors. They should. Or, at least, their return values should be able to express an "invalid" state.

I agree that your proposed change would be an improvement. I would also like to see the example in the README updated to not panic. I'm not sure if that would require a change to the function signature.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants