-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support for md ref links, others #6
Conversation
src/scrapeLinks.ts
Outdated
/\[.*?\]\((?:<((?:\(.*?\)|.)*?)>|((?:\(.*?\)|.)*?))(?: ["'].*?["'])?\)/gm, | ||
(x) => x[2] || x[1] | ||
); | ||
const mdRefLinks = regexMap( | ||
content, | ||
/\[.*?\]:\s*\n?\s*(.*)/gm, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Had to simplify the inner part of this regex ((.*)
) bc/ for some reason it wasn't matching URLs in this style.
I've never worked work with Typescript. How do you run scrapeLinks.test.js ? |
@jorgeorpinel I had been using |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It'd be best to have some unit testing for these specific kinds of links to prove they work (and continue to into the future) but if this works in the field test pr on dvc.org then it's fine enough to go. We'll need to make sure to deploy to the dist repo, as otherwise the action won't update!
Thanks @rogermparent ! I just noticed your approval but not without committing another change here 😬 so I'm requesting another review since I'm not too sure about it (related to iterative/dvc.org#2567 (review)). |
(I'll look into that unit test if I can meanwhile.) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can keep this in the oven for a little longer, I had assumed this was a higher priority issue than it seems and accepted what I assumed was something we needed to get moving. I'll give it another look, and we should probably wait until we have unit tests if we can.
ecde5d7
to
c3e27b8
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed the tests file quite a bit @rogermparent but this is the jist of it:
[ref]: https://www.ref.com | ||
[links]: | ||
www.links.in/newline | ||
[link-here]: | ||
/just/a/path |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Test this...
"https://www.ref.com", | ||
"www.links.in/newline", | ||
"/just/a/path", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Vs. this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmm BTW jest
is failing with
- Expected - 2
+ Received + 0
...
- "www.links.in/newline",
- "/just/a/path",
But if I remove these 2 lines then it fails with
- Expected - 0
+ Received + 2
...
+ "www.links.in/newline",
+ "/just/a/path",
🤔 Yelp
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm pretty sure you're looking at two different tests, one that splits the string into newlines and one that doesn't.
I just removed the newline split one and changed the inner workings so we don't get string arrays as content (by joining them before returning as content). That should solve this weirdness and allow us to pick up links that take multiple lines like this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right I thought it had something to do with the new line even though I put \n
in my regex. I thought maybe it was separately matching those links as plan links which is why I added other plain links in the sample (#6 (review) below).
TBH I'm not 100% what the change to the other files do but the tests are passing so thanks! Do you approve this PR now? 🙂
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TBH I'm not 100% what the change to the other files do
All just internal stuff, basically this project has a two ways of gathering links: file and GH diff. The GH diff method used to return an array of strings before since I had assumed no links span multiple lines, but that's clearly not the case- now that parser joins the array into a string before returning it, similarly to the filesystem-based parser.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you approve this PR now?
Sorry for the delay! I'm trying this PR out manually by running it on the CLI, and noticed some things:
- dvc.org's prettier rules are butchering these ref links, forcing them onto the same line
- when forced with
--no-verify
, these links aren't picked up
I test it out manually by running yarn build
, going to a local dvc.org
directory, and adding the MD test string to a random MD file. The links were found when prettier forced them on the same line, but multi-line ones aren't found even after all the changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just found a change that sort of fixes it (removing I forgot, local links aren't checked locally! I think after this small patch we should be good to go.+
from the start of the git diff lines), but for some reason /just/a/path
is still not found.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Patch is in! I can deploy tomorrow when I'm a bit less tired, I don't want to rush and mess it up. However, merging in the source repo won't break anything in prod so I can approve/merge. Thanks @jorgeorpinel!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey thank you so much @rogermparent ! I see you did some more tweaks to address some of the things you mentioned. Let's see what it does on dvc.org once deployed 😬
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After a bit of tweaking, we're ready to go! Deploy will wait for tomorrow.
Would close #4
Also adds support for links in the special dvc.org element
external-link
(rel. iterative/dvc.org#2567 (review))