PANFeed

Personalised Academic News Feeds

PANFeed is a service that creates personalised news feeds. It combines keyword identification in posts with a hotness metric to give the user news that is current and relevant to their listed interests. Keywords are generated using TF-IDF, which weighs how often a word appears in each blog post against how common it is across the corpus. Hotness is a modifier based on how old the post is.
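As a rough illustration of the idea above, the sketch below ranks terms by TF-IDF and scales a keyword match by a hotness factor. It is not PANFeed's actual code: the function names (`tfidf_keywords`, `hotness`, `score_post`), the whitespace tokenizer, the exponential decay, and the 24-hour half-life are all assumptions made for the example.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive tokenizer (assumption): lowercase, split on whitespace, letters only.
    return [w for w in text.lower().split() if w.isalpha()]


def tfidf_keywords(posts, top_n=3):
    """Return the top_n highest-scoring TF-IDF terms for each post."""
    docs = [Counter(tokenize(p)) for p in posts]
    n_docs = len(docs)

    # Document frequency: in how many posts does each term appear?
    doc_freq = Counter()
    for counts in docs:
        doc_freq.update(counts.keys())

    keywords = []
    for counts in docs:
        total = sum(counts.values()) or 1
        scores = {
            term: (freq / total) * math.log(n_docs / doc_freq[term])
            for term, freq in counts.items()
        }
        # Highest score first; ties broken alphabetically for stable output.
        ranked = sorted(scores, key=lambda t: (-scores[t], t))
        keywords.append(ranked[:top_n])
    return keywords


def hotness(age_hours, half_life_hours=24.0):
    # Assumed decay curve: 1.0 for a brand-new post, halving every half-life.
    return 0.5 ** (age_hours / half_life_hours)


def score_post(post_keywords, interests, age_hours):
    # Number of keywords matching the user's interests, scaled by hotness.
    overlap = len(set(post_keywords) & set(interests))
    return overlap * hotness(age_hours)
```

A word that appears in every post (such as "the") gets an inverse document frequency of zero, so it never ranks as a keyword. Two posts with the same keyword overlap are ordered by age, with the newer one scoring higher.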

PANFeed has a simple web interface for submitting feeds and also ships with a web spider for spidering domains in search of RSS and Atom feeds to process. There is also a bin script which should be run frequently to pull more news in from existing feeds.
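One common way for a spider to find feeds on a page is to look for `<link rel="alternate">` tags that advertise an RSS or Atom MIME type. The sketch below shows that technique with the Python standard library only. It is an assumption about how such discovery might work, not a description of PANFeed's spider, and `FeedLinkParser` and `find_feeds` are hypothetical names.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# MIME types conventionally used to advertise RSS and Atom feeds.
FEED_TYPES = {"application/rss+xml", "application/atom+xml"}


class FeedLinkParser(HTMLParser):
    """Collects feed URLs advertised via <link rel="alternate"> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = {name: (value or "") for name, value in attrs}
        rel = attr.get("rel", "").lower().split()
        mime = attr.get("type", "").lower()
        href = attr.get("href")
        if "alternate" in rel and mime in FEED_TYPES and href:
            # Resolve relative hrefs against the page URL.
            self.feeds.append(urljoin(self.base_url, href))


def find_feeds(html, base_url):
    # Parse one page of HTML and return the feed URLs it advertises.
    parser = FeedLinkParser(base_url)
    parser.feed(html)
    return parser.feeds
```

A full spider would also need to fetch pages, follow links within the domain, and avoid revisiting URLs. This sketch covers only the per-page discovery step.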

For installation instructions, see the INSTALL file.

For changes and version history, see CHANGELOG.

PANFeed is made available under the GNU GPL and is copyright of the University of Southampton. For more information see the LICENCE file.

For more information, contact [email protected], visit the central PANFeed server at http://panfeed.ecs.soton.ac.uk, or see our Google Code project at http://code.google.com/p/panfeed

Features

  • Django-based website for creating customized feeds and submitting feeds to be crawled
  • Python-based web crawler for crawling a domain to find all its RSS and Atom feeds
  • Python script to crawl the captured feeds and index them into the corpus
  • generates custom feeds suitable for blog readers and personal magazines

TODO

  • create a statistics dashboard for each feed
  • create a statistics dashboard for each domain
  • create a statistics dashboard about most popular keywords used to generate feeds
  • add a way to request domain spidering through the web interface
  • add an image to every single post to make it more exciting in Flipboard and Pulse
  • allow users to create domain specific feeds
