
19 Dec 2007

Impact Factor / “Lies, Damn Lies and Statistics”

There is a very popular saying: “There are three kinds of lies: lies, damned lies, and statistics.” So “statistics” has been labeled as one kind of lie. This is because even the most accurate statistics can be used to support inaccurate arguments.

“Impact Factors” are also statistics, derived from academic citations. If not understood properly, they can be misused to support wrong arguments; needless to say, they too are statistics, after all. These numbers originated in the “Information Storage and Retrieval” domain as a by-product of citation indexing, but they are increasingly being (mis)used to measure the academic excellence of individual scientists.

If we go back about 60 years, to the 1950s, a great deal was happening in the Information Storage and Retrieval domain. Machines (early computers) had shown the promise of an information revolution. However, natural languages were found to be inadequate for mapping concepts to words or phrases. The major problems were due to synonyms, homonyms, gender, number, word order in phrases, and punctuation marks in natural languages. These grammatical goodies enrich natural languages and make sense to human brains, but in the domain of information retrieval they inhibit a one-to-one mapping between concepts and words. So controlled vocabularies like MeSH were being developed.
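To make the synonym problem concrete, here is a toy sketch in Python (the terms and concept codes are invented for illustration, not how MeSH is actually implemented): a plain keyword match treats two synonymous terms as unrelated strings, while a controlled vocabulary folds both onto one concept.

    # Toy illustration: synonyms defeat naive keyword matching, so a
    # controlled vocabulary maps variant terms onto a single concept code.
    CONTROLLED_VOCAB = {                 # hypothetical MeSH-like mapping
        "heart attack": "C-001",
        "myocardial infarction": "C-001",
        "cancer": "C-002",
        "neoplasm": "C-002",
    }

    def concept_of(term):
        """Return the concept code for a term, or None if unmapped."""
        return CONTROLLED_VOCAB.get(term.lower())

    # Plain string comparison says the terms differ; the vocabulary agrees.
    print("heart attack" == "myocardial infarction")                           # False
    print(concept_of("Heart Attack") == concept_of("myocardial infarction"))   # True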

Eugene Garfield, however, thought of an innovative way of mapping between concepts by relying on the citations among articles. It goes like this: if “A” cites “C” and “B” also cites “C”, then “A” and “B” are referring to the same concept. He did not use “words” to map “concepts” and hence avoided the problems associated with languages.
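A rough sketch of that idea in Python (the citation data below is invented): articles are related through the references they share, without comparing a single word.

    # Toy citation graph: article -> set of articles it cites (invented data).
    cites = {
        "A": {"C", "D"},
        "B": {"C", "E"},
        "F": {"E"},
    }

    def related(article):
        """Articles sharing at least one cited reference with `article`."""
        refs = cites.get(article, set())
        return {other for other, other_refs in cites.items()
                if other != article and refs & other_refs}

    print(related("A"))  # {'B'} -- A and B both cite C, so they likely share a concept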

(In 1958, Eugene Garfield started ISI with a loan of US$500. Source: “Science for Sale”, Features, The Lab, the Australian Broadcasting Corporation's gateway to science.)

Garfield developed the concept of citation indexing in the 1950s. A product based on his citation indexing, the “Science Citation Index (SCI)”, was officially released in 1964. It was meant to be a tool for scientific information retrieval. SCI, however, became popular more for its utility in measuring scientific productivity than as a search engine, thanks to its by-product, the “SCI Journal Citation Reports (JCR)”, officially launched in 1975. A parameter evolved for comparing journals regardless of their size, and this parameter became known as the Journal Impact Factor. In the absence of any other objective criterion, the Journal Impact Factor has become the widely accepted parameter for comparing the “quality” of journals. Garfield himself describes the rise of the SCI and Impact Factors in the following article:

- E. Garfield. The evolution of the Science Citation Index (2007) http://www.iec.cat/1jcrc/GarfieldEEvolution.pdf
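As for what the number actually is: the standard two-year impact factor is a simple ratio, namely the citations a journal receives in year Y to the items it published in years Y-1 and Y-2, divided by the number of citable items it published in those two years. A worked arithmetic example in Python (all figures invented):

    # Impact factor for year Y, using the standard two-year window.
    # All figures below are invented, purely for illustration.
    citations_in_2007_to_2005_2006_items = 450
    citable_items_2005 = 120
    citable_items_2006 = 130

    impact_factor_2007 = citations_in_2007_to_2005_2006_items / (
        citable_items_2005 + citable_items_2006
    )
    print(round(impact_factor_2007, 2))  # 1.8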

However, one should always remember that “Impact Factors” are calculated only for journals included in the ISI database (SCI). I am very frequently asked this question: what is the impact factor of such-and-such Indian journal? Very few Indian journals are covered in the SCI database, so except for a few, most do not have impact factors at all. I feel like replying: there are three kinds of lies: lies, damn lies and impact factors! Because I know that these “impact factors” are being applied to areas for which they were never meant.

It sometimes becomes very difficult to explain that the measure undercounts citations from journals in less-developed countries (a toy calculation after the reference below makes this concrete). The research priorities of less-developed countries can be entirely different from those of developed countries, and if research is not of contemporary interest to the journals indexed in the ISI databases, it is unlikely to be cited. Linking research funding to citations and impact factors could lead to a situation where less-developed countries fund the research problems of the developed countries.

Academic administrators, without knowing the complexities of the Impact Factor, are now increasingly tempted to apply it to measure the individual productivity of scientists. The following article explains the problem with such “numbers”:

- Richard Monastersky. “The Number That's Devouring Science”. The Chronicle of Higher Education, October 14, 2005.
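Here is that toy calculation (all journal names and figures invented): citations arriving from journals outside the indexed set simply never enter the numerator.

    # Toy illustration of the indexing bias (all data invented).
    indexed_journals = {"J-West-1", "J-West-2"}   # journals covered by the database
    citing_events = [                             # (citing journal, citation count)
        ("J-West-1", 30),
        ("J-West-2", 20),
        ("J-India-1", 40),   # not indexed: these citations are never counted
    ]
    citable_items = 50

    counted = sum(n for journal, n in citing_events if journal in indexed_journals)
    total = sum(n for _, n in citing_events)

    print(counted / citable_items)  # 1.0 -- the impact factor as actually computed
    print(total / citable_items)    # 1.8 -- what it would be if every citation counted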

Here is one more reference:

- P.O. Seglen. Why the impact factor of journals should not be used for evaluating research. (1997) BMJ 314(7079): 498-502. PubMed: 9056804.

[Try http://www.bmj.com/cgi/content/full/314/7079/497 for the full text.]

However, there is one good thing about these “Impact Factors”: they are at least good at indicating the “popularity” of journals. Well, do “popularity” and “quality” mean the same thing?

One more thing: they arose from “citation indexing”, the very idea the founders of Google used to develop a killer search engine.
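For the curious, here is a minimal sketch of that citation-ranking idea in the spirit of PageRank (a toy power iteration over an invented link graph, not Google's actual implementation):

    # Toy PageRank-style ranking on an invented citation/link graph.
    links = {                 # node -> nodes it links to (or article -> its citations)
        "A": ["B", "C"],
        "B": ["C"],
        "C": ["A"],
    }
    pages = list(links)
    damping = 0.85
    rank = {p: 1.0 / len(pages) for p in pages}

    for _ in range(50):       # power iteration until the ranks settle
        new = {p: (1 - damping) / len(pages) for p in pages}
        for p, outs in links.items():
            for q in outs:
                new[q] += damping * rank[p] / len(outs)
        rank = new

    # Heavily "cited" nodes accumulate rank, much as heavily cited
    # journals accumulate impact.
    print(sorted(rank.items(), key=lambda kv: -kv[1]))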


1 comment:

Unknown said...

It is very nice and informative. It's an eye-opener for many.