About the Data
Below you can find more information about where we gather data and how we process it. But let's state this
first: we love to collaborate with universities and other providers of data to improve the quality of
ScienceFinder. Please reach out to the team at firstname.lastname@example.org.
Ranking and Scoring
Results in ScienceFinder are ranked. ScienceFinder does not offer advertising or the
option to boost or adjust these rankings. The only way for institutions to affect their ranking is to deliver
more complete data through the existing infrastructures.
All items are ranked by the same relevance algorithm, which uses a full-text match against the search keywords.
The relevance of a document decays over time according to a Gaussian decay function, with a relevance drop of
5% for every year of age after the first year.
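As a rough illustration of such a decay, the sketch below uses the Gaussian decay parameterisation common in
search engines (for example the decay functions in Elasticsearch). The parameter values (a one-year offset, a
one-year scale and a multiplier of 0.95 at that scale) are our assumptions for illustration, not the
production configuration. Note that with a Gaussian curve the drop is 5% in the second year and grows faster
after that.

```python
import math


def gauss_decay(age_years, offset=1.0, scale=1.0, decay=0.95):
    """Return a multiplier in (0, 1] for a document of the given age.

    All parameter values are illustrative assumptions.
    """
    # No penalty inside the offset window (the first year).
    distance = max(0.0, age_years - offset)
    # Choose sigma^2 so that the multiplier equals `decay` when distance == scale.
    sigma_sq = -(scale ** 2) / (2.0 * math.log(decay))
    return math.exp(-(distance ** 2) / (2.0 * sigma_sq))


# The decayed relevance is the full-text score times the multiplier.
for age in (0.5, 2.0, 3.0):
    print(age, round(gauss_decay(age), 4))
```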
A ScienceFinder query links three document types (Publications, Projects and Startups) to individuals, who
are in turn affiliated with organisations. The 75 organisations with the highest summed relevance score of
their documents are returned.
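The aggregation described above can be sketched as follows. The function name, the input dictionaries and the
choice to count a document only once per organisation (even when several of its authors share an affiliation)
are assumptions made for this example.

```python
from collections import defaultdict


def top_organisations(doc_scores, doc_authors, affiliations, limit=75):
    """Sum document relevance per organisation and return the top `limit`.

    doc_scores:   document id -> relevance score
    doc_authors:  document id -> list of individuals
    affiliations: individual  -> list of organisations
    """
    totals = defaultdict(float)
    for doc_id, score in doc_scores.items():
        # Collect every organisation reachable through the document's authors.
        orgs = set()
        for person in doc_authors.get(doc_id, []):
            orgs.update(affiliations.get(person, []))
        # Assumption: a document counts once per organisation.
        for org in orgs:
            totals[org] += score
    # Highest total first; ties are broken alphabetically for a stable order.
    ranked = sorted(totals.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:limit]
```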
To facilitate access to the often technical content of scientific outputs, ScienceFinder uses a combination of
the PLOS taxonomy and Wikipedia categories for query expansion.
Terms that are both a PLOS taxonomy term and a Wikipedia category will also be found with search terms that
are subcategories of the Wikipedia parent category.
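A minimal sketch of this behaviour, under our own assumptions: the function name, the data structures and the
example terms are hypothetical, and only the direct parent category is considered.

```python
def expand_query(term, wikipedia_parent, plos_terms):
    """Return the search term plus its parent category, where applicable.

    wikipedia_parent: subcategory -> parent Wikipedia category
    plos_terms:       set of terms that also occur in the PLOS taxonomy
    """
    terms = [term]
    parent = wikipedia_parent.get(term)
    # Only add the parent when it is both a Wikipedia category and a PLOS term.
    if parent is not None and parent in plos_terms:
        terms.append(parent)
    return terms


# Illustrative data only; not taken from the actual taxonomies.
wikipedia_parent = {"Deep learning": "Machine learning"}
plos_terms = {"Machine learning"}
print(expand_query("Deep learning", wikipedia_parent, plos_terms))
```

In this example a search for the subcategory also matches documents that mention the parent term.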
ScienceFinder connects to various open data sources, such as Narcis and Cordis.
We use APIs to connect to these sources and periodically update our data from them.
Additionally, ScienceFinder refers to individual university libraries and staff pages to determine affiliation
links. Currently, publication data dated from January 1st, 2012 through 2019 is periodically harvested for all
Dutch universities, as available in Narcis.
ScienceFinder is ‘downstream’ from these sources, and improvements in their data quality are the
best way to improve the search results in ScienceFinder.
Unfortunately, some sources don’t offer persistent identifiers such as DOIs for their content or ORCIDs for
their authors, which makes deduplication and reconciliation a challenge. Please be aware that results gathered
from ScienceFinder still contain imperfections, as we work on continuously improving the data.
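To illustrate why missing identifiers make this hard, here is a minimal deduplication sketch. The record
fields (`doi`, `title`, `year`) and the fallback on a normalised title are assumptions for this example, not a
description of our actual pipeline.

```python
import re


def dedup_key(record):
    """Build a matching key: the DOI when present, else normalised title plus year."""
    doi = (record.get("doi") or "").strip().lower()
    if doi:
        return ("doi", doi)
    # Fallback: lower-case the title and collapse punctuation and whitespace.
    title = re.sub(r"[^a-z0-9]+", " ", record.get("title", "").lower()).strip()
    return ("title", title, record.get("year"))


def deduplicate(records):
    """Keep the first record seen for each key."""
    seen = {}
    for record in records:
        seen.setdefault(dedup_key(record), record)
    return list(seen.values())
```

With a DOI the match is exact. Without one, the fallback can miss duplicates whose titles differ slightly, and
it never links a record that has a DOI to a copy of the same work that lacks one.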
Other data sources
In addition to the Narcis and Cordis data, we enrich our database with data that we receive through close
collaboration with Dutch universities. This is usually data about academic spin-off companies,
industry-academia collaborations and other relevant projects.
If you feel that specific project data is missing, please get in touch and help us improve the quality of
the data! Reach us through email@example.com.