CiteSeerx is an evolving scientific literature digital library and search engine that has focused primarily on the literature in computer and information science. CiteSeerx aims to improve the dissemination of scientific literature and to provide improvements in functionality, usability, availability, cost, comprehensiveness, efficiency, and timeliness in access to scientific and scholarly knowledge.
Rather than creating just another digital library, CiteSeerx attempts to provide resources such as algorithms, data, metadata, services, techniques, and software that can be used to promote other digital libraries. CiteSeerx has developed new methods and algorithms to index PostScript and PDF research articles on the Web. CiteSeerx provides the following features:
- Autonomous citation indexing (ACI) - CiteSeer uses ACI to automatically extract citations and create a citation index that can be used for literature search and evaluation. Compared to traditional citation indices, ACI provides improvements in cost, availability, comprehensiveness, efficiency, and timeliness.
- Automatic metadata extraction - CiteSeer automatically extracts author, title and other related metadata for analysis and document search.
- Citation statistics - CiteSeer computes citation statistics and related documents for all articles cited in the database, not just the indexed articles.
- Reference linking - CiteSeer was the first to allow browsing documents using citation links that are automatically generated.
- Author disambiguation - CiteSeer uses scalable methods to automatically distinguish authors from one another.
- Citation context - CiteSeer can show the context of citations to a given paper, allowing a researcher to quickly and easily see what other researchers have to say about an article of interest (no longer available).
- Awareness and tracking - CiteSeer provides automatic notification of new citations to given papers, and new papers matching a user profile.
- Related documents - CiteSeer locates related documents using citation-based and word-based measures and displays an active and continuously updated bibliography for each document.
- Full-text indexing - CiteSeer indexes the full text of entire articles and citations. Full Boolean, phrase, and proximity search is supported.
- Query-sensitive summaries - CiteSeer provides the context of how query terms are used in articles instead of a generic summary, improving the efficiency of search.
- Up-to-date - CiteSeer is regularly updated based on user submissions and regular crawls.
- Powerful search - CiteSeer uses fielded search to allow complex queries over content, and allows the use of author initials to provide more flexible name search.
- Harvesting of articles - CiteSeer automatically harvests research papers from the public Web and also accepts submissions through a submission system.
- Metadata of articles - CiteSeer automatically extracts and provides metadata from all indexed articles.
- Personal Content Portal - CiteSeer provides features such as personal collections, RSS-like notifications, social bookmarking, and social network facilities. Personalized search settings and institutional data tracking are possible. Users can submit documents through an easy-to-use document submission system.
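
To make the citation indexing and citation statistics features above more concrete, the following is a minimal sketch in Python of how a citation index might be built from already-extracted reference lists. It is an illustration only and not CiteSeer's actual implementation: the function names (`normalize`, `build_citation_index`, `citation_counts`), the toy corpus, and the crude title-normalization rule are all assumptions made for this example. Real systems must also parse citation strings and match variant forms of the same reference, which this sketch does not attempt.

```python
from collections import defaultdict


def normalize(title: str) -> str:
    """Crude matching key (an assumption): lowercase, letters and digits only."""
    return "".join(ch for ch in title.lower() if ch.isalnum())


def build_citation_index(documents: dict) -> dict:
    """Map each cited work (by normalized title) to the set of citing documents.

    `documents` maps an indexed document's title to the list of titles it cites.
    Cited works get an entry even if they are not themselves indexed.
    """
    index = defaultdict(set)
    for citing_title, cited_titles in documents.items():
        for cited in cited_titles:
            index[normalize(cited)].add(citing_title)
    return dict(index)


def citation_counts(index: dict) -> dict:
    """Count the distinct citing documents for every cited work."""
    return {key: len(citing) for key, citing in index.items()}


# Hypothetical toy corpus: three indexed papers and their reference lists.
corpus = {
    "Paper A": ["Paper C", "An Unindexed Report"],
    "Paper B": ["Paper C.", "paper a"],
    "Paper C": ["An unindexed report"],
}

index = build_citation_index(corpus)
counts = citation_counts(index)

for key in sorted(counts):
    print(key, counts[key], sorted(index[key]))
```

In this sketch, "An Unindexed Report" receives a citation count even though it is not one of the indexed papers, which mirrors the point that statistics are computed for all cited articles and not just the indexed ones. The inverted mapping from cited work to citing documents is also what would allow browsing by citation links.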