Consolidate a list of data sources for drugs and their uses #14
Comments
Listing a few resources that people have already mentioned in Slack or elsewhere: |
adding to the list: |
FYI - waiting for access to the data.world account to finalize some changes before pull requesting, but I've finished tidying the KEGG USP Drug Classification data. Looks like it has info on non-Medicare drugs including drug categories and classes. For example, Naproxen's drug category is I'm not sure yet if this data also includes Part D Medicare drugs, or if that's in another dataset and this just has more common drugs - I think the Medicare drugs will be in the USP MMG dataset (which unfortunately doesn't seem to be on KEGG...). More info on this here. |
@cduvallet I just added you to the data.world organization. Thanks for taking this on! |
I've requested a UMLS license for the full RxNorm data |
@peter0083 - are you working on the Merck Manual? If not, I'll take a stab at scraping it - seems quite useful! |
@cduvallet - actually I haven't been able to work on Merck Manual yet. Please go ahead! 😃 |
Updates after some poking around the Merck Manual and Medline Plus (I've seen a few other people working on RxNorm so I've left it alone for now...)
Moving forward, I think that:
|
I took a stab at scraping this. I was able to get the base table, which is a cross-reference of generic name to brand name. Even getting that much required downloading the source HTML and then parsing the file. The script is currently on my fork. Their site doesn't allow direct URL access; from what I can gather, you must use the menu system. Even pasting the direct URL in a browser doesn't work. The same goes for the pop-up windows that have all the additional information: from what I can tell, they are called by JavaScript functions. I looked at their consumer site and it's the same story. I'll continue to work on this. |
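For anyone following along, here is a rough sketch of the "save the HTML, then parse the file" approach described above. It is not the script on the fork: the table layout, the sample markup, and the `TableRows` class are all assumptions made up for illustration, and it uses only Python's standard `html.parser`.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # cells of the row currently being read
        self._cell = None   # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            # Join the fragments and collapse runs of whitespace.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:
                self.rows.append(self._row)
            self._row = None
            self._cell = None


# Hypothetical stand-in for a saved page; real markup will differ.
SAMPLE_HTML = """
<table>
  <tr><th>Generic Name</th><th>Brand Name</th></tr>
  <tr><td>generic-a</td><td>Brand A</td></tr>
  <tr><td>generic-b</td><td> Brand  B </td></tr>
</table>
"""

parser = TableRows()
parser.feed(SAMPLE_HTML)
parser.close()
header, *pairs = parser.rows
print(header)
print(pairs)
```

With a real saved page you would read the file's text and pass it to `feed()` instead of `SAMPLE_HTML`. Content that only appears after a JavaScript call (like the pop-ups mentioned above) would not be in the saved source, so this sketch would not see it.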
This is heroic work, you guys - thank you! I definitely agree that the Merck data could be helpful in general if we can get it without too much pain. @TBusen thanks for working on it - keep us posted; if it's painful we can talk about whether it's worth it! @cduvallet to your very valid questions - the main thing we're after at this moment is a way to "translate" drugs into pharmaceutical and therapeutic classes. The CMS data already has both brand and generic drug names, but if we want to look at medications for diabetes/anxiety/etc, we need a way to group them. The USP data you put together might do this well if it covers the Medicare Part D drugs; perhaps we can add a new issue for someone to try combining them and see what we get. You've got the deepest understanding of that data - how does that sound to you? |
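To make the "combine them and see what we get" idea concrete, here is a minimal sketch of matching a drug list against a classification table and grouping by category. The column names (`generic_name`, `brand_name`, `drug_name`, `category`) and all sample rows are placeholders, not the actual CMS or USP schemas, and exact matching on a normalized name is just one simple assumption about how the join could work.

```python
import csv
import io
from collections import defaultdict

# Hypothetical stand-ins for the real files; column names and values are made up.
CMS_SAMPLE = """generic_name,brand_name
DRUG A,BrandA
Drug B,BrandB
DRUG C,BrandC
"""

CLASS_SAMPLE = """drug_name,category
drug a,Category X
drug b,Category Y
"""


def normalize(name):
    """Lowercase and collapse whitespace so names compare consistently."""
    return " ".join(name.lower().split())


def group_by_category(cms_text, class_text):
    """Return (category -> sorted generic names, sorted unmatched generic names)."""
    lookup = {
        normalize(row["drug_name"]): row["category"]
        for row in csv.DictReader(io.StringIO(class_text))
    }
    groups = defaultdict(set)
    unmatched = set()
    for row in csv.DictReader(io.StringIO(cms_text)):
        generic = normalize(row["generic_name"])
        category = lookup.get(generic)
        if category is None:
            unmatched.add(generic)  # no classification found for this drug
        else:
            groups[category].add(generic)
    return {cat: sorted(names) for cat, names in groups.items()}, sorted(unmatched)


groups, unmatched = group_by_category(CMS_SAMPLE, CLASS_SAMPLE)
print(groups)
print(unmatched)
```

The `unmatched` list is the useful part for a first pass: it shows how much of the drug list the classification data fails to cover, which is the open question about Part D coverage raised above.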
Yes, happy to take a look this weekend or early next week! (I'm at a conference all day tomorrow and Friday). Do you want to make a new issue for this and assign it to me? Also, since @TBusen seems to be chugging away at the Merck Manual scraping, perhaps that can become its own issue too? Also, FWIW I think it'll likely be worth it - the Merck Manual has actual disease and condition names, rather than broad therapeutic or biological classes... ;) |
Created issue #50 to create a dedicated ticket for the Merck Manual effort. If anyone needs help with scraping, feel free to contact me directly for help -- I've done a little bit of work in this area and may be of use. |
Epocrates https://online.epocrates.com/drugs also has a comprehensive list of drugs with their uses... But has anyone looked into whether scraping Merck or Epocrates violates the user agreements, esp. if the results are posted publicly on data.world? |
@davidlibland Thanks for reminding us about that. I've got an email out to Merck now, but I specifically left out the idea of us publishing their data on data.world, because let's be real -- that's never gonna happen. 😢 But, they may let us use their data privately, which works almost as well! 👍 |
Another good source of data: http://www.pdr.net Probably a long shot due to their TOS, but parking it here for completeness. |
@mgawarecki Thanks for contacting Merck, hopefully they will be happy to support the project (they might like the publicity for their data source). I've also noticed that Google shows a good deal of info for many of the medications on the list. Has anyone tried to catalogue that? Otherwise, if I get the chance, I'll see if I can do anything to capture it.
… On Feb 20, 2017, at 6:57 AM, Matt Gawarecki ***@***.***> wrote:
@davidlibland Thanks for reminding us on that. I've got an email out to Merck now, but I specifically left out the idea of us storing their data on data.world, because let's be real -- that's never gonna happen. 😢 But, they may let us use their data privately, which works almost as well! 👍
|
@davidlibland Looks like Google even has a listing of which sources they use and how they gather that data: https://support.google.com/websearch/answer/2364942?p=medical_conditions&visit_id=1-636232883305572138-808022200&rd=1 Hope that helps as you look into it! |
@TBusen @jenniferthompson @davidlibland I got a reply back from Merck just now regarding our use of Merck Manuals. As much as I hate to say it, it would appear we can't use their data in our work. 😢
|
Oh, that's a bummer. Grateful for a quick response from them. |
man...... I thought Merck would be more generous with us because we were non-profit. 😞 |
Aw, well, maybe write back and ask her what other source she recommends? |
I think it's safe to assume the discussions in this ticket have led to a number of new tickets around linking drug names and uses. Of course, the Merck Manual didn't pan out the way we'd hoped, but other data and matching efforts have fared much better. Closing this issue so we can focus on using the data we've found. |
Task
Create a list of data sources linking drug names to their uses and/or other similar drugs. Post the end result in this issue so we'll have a record we can look back on.
How this will help
Issue #6 also relates to drug uses; specifically, it seeks to link the drugs in our Medicare Part D data set to their respective purposes. While we started #6 with a good set of data to work from, there's a growing list of places we can look to get more information. With the right set(s) of eyes, we might be able to cover the CMS drug list more comprehensively. Before we can do that, though, we need to actually list out all the sources we're aware of.