BITS and Books

We live in the age of big data analysis. Data analysis is now being applied to every field under the sun. So why not apply it to something that has been a popular pastime for many decades: preparing lists of the most popular books, movies and the like?

BPHC is blessed with a very user-friendly library with a large collection of books. The library has an online catalogue which most of us would have used at some time or other. Once, when I was searching the catalogue for a particular book, my attention was caught by the “Most popular” button, which appears just below the search button.

Since, like many of you, I am a clickomaniac, I clicked on the “Most popular” button without giving it a second thought, and out popped a list of the most popular books in the library. Here, popularity is defined by the total number of times a book has been checked out of the library.

Our library has a good collection of fiction titles, and given the popularity of novels among the residents of the campus, one would expect a novel to be the most checked-out book. Surprisingly, this is not true! The book with the maximum number of checkouts (420) is “Achieving results” by Lorna Riley. That sounds like what is referred to as a “self-help” book. There were 6 other self-help books in the top 100 most popular books.

This unexpected finding aroused my interest, and I started a systematic analysis of the 100 most checked-out books in the library. My methodology was quite simple: I copied the data from the library site into a Microsoft Excel file, which simplified the process of sorting and categorising the books.
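For readers who prefer a script to a spreadsheet, the same sorting and categorising could be sketched in Python. This is only an illustration, not the method used for this article: the `Book` record and the helper names are assumptions, and apart from the first row (taken from the text above) the sample rows are placeholders rather than real catalogue data.

```python
from collections import Counter
from typing import NamedTuple


class Book(NamedTuple):
    """One row of the copied list (hypothetical layout)."""
    title: str
    author: str
    category: str
    checkouts: int


# Only the first row comes from the article; the others are placeholders.
books = [
    Book("Achieving results", "Lorna Riley", "Self-help", 420),
    Book("Placeholder novel A", "Author X", "Fiction", 90),
    Book("Placeholder novel B", "Author X", "Fiction", 75),
    Book("Placeholder textbook", "Author Y", "Academic / textbook", 60),
]


def rank_by_checkouts(rows):
    """Return the rows sorted from most to least checked out."""
    return sorted(rows, key=lambda b: b.checkouts, reverse=True)


def books_per_category(rows):
    """Count how many books fall in each category."""
    return Counter(b.category for b in rows)


def checkouts_per_author(rows):
    """Sum the checkouts of all books by each author."""
    totals = Counter()
    for b in rows:
        totals[b.author] += b.checkouts
    return totals


if __name__ == "__main__":
    for book in rank_by_checkouts(books):
        print(book.checkouts, book.title)
    print(books_per_category(books).most_common())
    print(checkouts_per_author(books).most_common())
```

A `Counter` keeps the tallies short; its `most_common()` method returns the entries already ordered by count.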

It is not difficult to guess the most popular author. As I mentioned, novels are popular, especially among students, and murder mysteries even more so. The most popular author, with 11 books in the top 100, is Agatha Christie (675 checkouts in total). She is closely followed by J.K. Rowling, the author of the Harry Potter books (8 books, 583 checkouts in total). The top two ladies in the list are followed by two gentlemen: Paulo Coelho and Dan Brown. The other popular authors are mentioned in table 1. Ironically, you will notice from table 1 that Chetan Bhagat is level with Isaac Asimov in this list!

Table 1. Most popular authors (fiction)

Author             Number of books in the top 100
Agatha Christie    11
J.K. Rowling       8
Paulo Coelho       7
Dan Brown          6
Jeffrey Archer     4
Amish Tripathi     3
Rick Riordan       3
Chetan Bhagat      2
Isaac Asimov       2

 In all, fiction constitutes 64 of the top 100. The category-wise break-up of the top 100 books is given in table 2.

Table 2. Category-wise breakup of top 100

Category               Number of books in the top 100
Fiction                64
Academic / textbook    20
Non-fiction            9
Self-help              7

So academic books constitute only 20% of the top 100. I leave the interpretation of this fact to your imagination. Unsurprisingly, 12 of the 20 academic books in this list belong to Computer Science. The other disciplines are mentioned in table 3.

Table 3. Discipline-wise break-up of academic books

Discipline                              Number of books in the top 100
Computer Science                        12
Mechanical Engineering                  4
Physics                                 2
Chemical Engineering                    1
Electrical & Electronics Engineering    1

You might be wondering which books come under non-fiction (table 2). This category covers a broad area, from popular science (e.g. A Brief History of Time) to books on history, economics, and mythology. While the discipline-wise categorisation was done manually, it is possible to automate it using the “call number” of a book, which is listed with the other information about the book.
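As a rough idea of what that automation might look like, here is a minimal Python sketch. It assumes the call numbers follow a Dewey Decimal style (a leading number followed by letters), which the article does not confirm for our library. The numeric ranges are approximate guesses at where each discipline sits, and the function name and sample call numbers are made up for illustration.

```python
import re

# Assumed Dewey-style ranges: lower bound inclusive, upper bound exclusive.
# Order matters: the narrow 621.3 range is tested before the broader 621 range.
# These boundaries are rough assumptions, not the library's actual scheme.
DISCIPLINE_RANGES = [
    (4.0, 7.0, "Computer Science"),
    (530.0, 540.0, "Physics"),
    (621.3, 621.4, "Electrical & Electronics Engineering"),
    (621.0, 622.0, "Mechanical Engineering"),
    (660.0, 670.0, "Chemical Engineering"),
]


def discipline_from_call_number(call_number):
    """Guess a discipline from the leading number of a call number."""
    match = re.match(r"\s*(\d{1,3}(?:\.\d+)?)", call_number)
    if match is None:
        return "Unclassified"
    number = float(match.group(1))
    for low, high, discipline in DISCIPLINE_RANGES:
        if low <= number < high:
            return discipline
    return "Unclassified"


if __name__ == "__main__":
    # Hypothetical call numbers, not taken from the catalogue.
    for sample in ["005.133 KER", "530.12 GRI", "621.381 SED", "823 CHR"]:
        print(sample, "->", discipline_from_call_number(sample))
```

Anything outside the listed ranges, such as a novel, simply comes back as "Unclassified", so the table of ranges can be extended as more disciplines appear in the list.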

I have spent quite some time analysing and discussing this list, and one thing you will have realised is that such a list is just a snapshot[1] taken at a particular time. The list may change over time, and of course this is only small data analysis, not big data analysis. But it does give us a peek into the reading habits of BPHC.



[1] The data for this article was collected in the summer of 2019.