Last week I read an excellent blog post by Johanna Kieniewicz on PLOS.org about the combination of art and science. The post details why blending art and science is important and how it can be highly productive for both groups. I am not very artistic (as my previous drawings demonstrate) but I enjoy exploring some of the more aesthetically pleasing areas of my work. Before coming to Cranfield I worked briefly as a consultant for an artist, and it was a very enlightening look at differing perspectives on the products of science.
Reading the article at PLOS reminded me not only of my brief employment by an artist but also of a pet project I worked on a couple of years ago: an attempt to find a way of visualising scientific literature research. The main result was a set of highly artistic images created from real publishing data.
What I set out to do was graphically represent the connections between authors in a specific set of papers, mapping each co-author connection to see where people have moved between groups or collaborate strongly with other groups. Below is an example of what I was aiming for, using the author data from my meager collection of 3 published papers.
The above graphic was constructed using the text output of a Web of Science (WOS) search for my papers. This text output is processed into something called an author edge map, which is essentially a list of all the connections in the authorship of the papers. The processing is done using a custom bit of code written in LabVIEW. The edge map is then converted into a node-edge map using the open-source software Gephi. You can see from the diagram above that I have primarily published with two distinct groups of people: on the right my ex-colleagues at Mediwatch, and on the left my current colleagues at Cranfield University. Of course, such a small sample size makes this diagram a little simple, so the next step was to use data from someone with a much bigger publishing record, the head of our department, Ralph Tatam.
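For anyone curious what the edge-map step looks like in code, here is a rough Python sketch. It is not the LabVIEW program described above, just an illustration of the idea. It assumes the WOS export is in the tagged plain-text format, where each record lists its authors under an `AU` tag (with further authors on indented continuation lines) and ends with `ER`. The function names and the sample authors are made up for the example. The output is a CSV with `Source`, `Target` and `Weight` columns, a layout Gephi's spreadsheet import should be able to read as an edge table.

```python
import csv
import io
from collections import Counter
from itertools import combinations

# Made-up sample in the assumed WOS tagged format (two papers).
SAMPLE = """PT J
AU Smith, A
   Jones, B
   Lee, C
TI First paper
ER

PT J
AU Jones, B
   Smith, A
TI Second paper
ER
"""


def parse_authors(wos_text):
    """Return one list of author names per record."""
    records, authors, in_au = [], [], False
    for line in wos_text.splitlines():
        tag = line[:2]
        if tag == "AU":
            # First author sits on the same line as the tag.
            in_au = True
            authors.append(line[3:].strip())
        elif in_au and line.startswith("   ") and line.strip():
            # Indented continuation line: another author of the same paper.
            authors.append(line.strip())
        else:
            in_au = False
            if tag == "ER":
                # End of record: store the authors and start afresh.
                records.append(authors)
                authors = []
    return records


def edge_counts(records):
    """Count how many papers each pair of authors shares."""
    edges = Counter()
    for authors in records:
        for a, b in combinations(sorted(set(authors)), 2):
            edges[(a, b)] += 1
    return edges


def edges_to_csv(edges):
    """Write the edge map as Source,Target,Weight rows."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["Source", "Target", "Weight"])
    for (a, b), weight in sorted(edges.items()):
        writer.writerow([a, b, weight])
    return out.getvalue()


if __name__ == "__main__":
    print(edges_to_csv(edge_counts(parse_authors(SAMPLE))))
```

Sorting the names within each paper means "Smith and Jones" and "Jones and Smith" count as the same connection, so repeat collaborations show up as a heavier edge rather than as duplicates.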
This image is… a mess. A pretty mess, but still a mess. Unlike me, Ralph has a huge number of papers and has been publishing over a longer time span with a number of different groups. Gephi relies on a complex algorithm to sort and space the individual authors, but because Ralph is linked to every single author, this creates a problem. The simplest solution was to remove Ralph from the map: we know that anyone on the map must be linked to him anyway, so there is no point having him on it. That way the resulting map shows better groupings.
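Removing the central author is a one-line filter on the edge map. The sketch below assumes the edge map is held as a Python dictionary keyed by author pairs; the names and weights are placeholders, not real data.

```python
def drop_author(edges, name):
    """Return a copy of the edge map without any edge touching `name`."""
    return {pair: weight for pair, weight in edges.items() if name not in pair}


# Placeholder edge map: "Hub" has co-authored with everyone else.
EDGES = {
    ("Hub", "Author A"): 4,
    ("Hub", "Author B"): 2,
    ("Hub", "Author C"): 1,
    ("Author A", "Author B"): 2,
}

if __name__ == "__main__":
    # Only the Author A / Author B link survives.
    print(drop_author(EDGES, "Hub"))
```

Note that an author whose only link was to the hub (Author C here) drops off the map entirely, since no edges are left to place them.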
The above version of the papers map, using only co-authors, is much clearer. There are distinct groupings, often spreading out from a key individual (a significant proportion were co-authored with Steve James, who is the big dot in the middle). Interestingly, each sub-group is often connected to other groups by a number of other people; this is fairly indicative of Ralph working as head of an entire department with a wide range of activities. The more isolated clusters (e.g. the little island on the right) may be from other groups Ralph has previously worked with. It would be interesting to see what correlation there is between the groupings shown here and Ralph’s career.
Where this kind of visualisation comes into its own is in looking at a new field. I originally started working on this project when I was first using a particular molecule called calixarene. One thing I found particularly hard was working out who the big names were and which groups were the best to seek advice from. My initial Web of Science search turned up far too many hits from a highly diverse range of specialties for calixarenes, so I filtered the data for papers on calixarenes using Langmuir-Blodgett (LB) coating, as this was the area I would be working in.
Just glancing at the diagram, I can see there are a number of groups, with one dominant group and a plethora of smaller isolated groups. It was fantastic to have this subject structure so clearly displayed. Realistically the field of calixarenes and LB is relatively small and I could have picked out the key names fairly quickly from paper searching, but this mapping method is much easier on the eye!
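The "dominant group plus isolated groups" picture can also be pulled out of the edge map as a list, without drawing anything. The sketch below treats each isolated group as a connected component of the co-author network and finds them with a simple graph search. This is a different technique from Gephi's layout, which only arranges the picture, and the author names are placeholders.

```python
from collections import defaultdict


def author_groups(edges):
    """Return the connected groups of authors, largest first."""
    neighbours = defaultdict(set)
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)

    seen, groups = set(), []
    for start in neighbours:
        if start in seen:
            continue
        # Walk outwards from `start` until no new co-authors turn up.
        group, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(neighbours[node] - group)
        seen |= group
        groups.append(group)
    return sorted(groups, key=len, reverse=True)


# Placeholder co-author pairs: one group of three and two isolated pairs.
EDGES = [("A", "B"), ("B", "C"), ("D", "E"), ("F", "G")]

if __name__ == "__main__":
    for group in author_groups(EDGES):
        print(len(group), sorted(group))
```

The largest group comes first, which is a quick way to see who sits in the dominant cluster before opening the full map.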
To demonstrate the method a little more widely, below is the map I made by plotting all the co-authors of the 500 most cited papers containing the phrase “Long period grating” (LPG) in the title. LPGs are one of the many sensor platforms we use and, as a biochemist coming into photonics, I expected that mapping this vast field would be very insightful. I had to limit the map to 500 papers as my original search by title produced 2005 papers, which is far too many to deal with.
Just from glancing at the low-res version (click to see the high-res PDF) you can see that this field is very interconnected, which is a great indicator of just how much collaboration there is within the field. The small group on the far right is the group that originally produced LPGs, whereas the larger four or five groups in the middle are the people who have taken LPGs to a whole new level. Our group is one of the smaller ones somewhere near the middle, if you look hard enough.
Finally, I did one for papers on the subject of Erinaceomorphas after a conversation I had on Twitter. Why I am now suddenly interested in Erinaceomorphas I’ll cover in a future post (update: link to post). According to the diagram below there is not much collaboration in the world of small woodland mammal research.
Not sure where that grass came from, probably a bug.
Most of the time the only thing holding me back from exploring the artistic side of my work is time. It’s great when a project has a big visual impact (for example our vortex rings) but my focus is still more towards research, so I love having work such as this that is both useful and visually striking. And as an added bonus, if I take out the names the maps make very nice wallpaper images on my computer.
I will post the software I wrote to make these images from Web of Science output as soon as I have tidied it up a bit (end of the week). In the meantime, I strongly advise that you all start playing with Gephi, the open-source program that creates the final image.
Helen Gray · 28 November 2012 at 17:47
I think this may be your most interesting article yet – although I am an artistic Humanities gal so I’m totally biased!
But the concept of mapping author connections is not only purty, it’s useful too… And I can imagine how much easier it would be to *look* at clusters rather than sit down with a giant list of names and play investigative detective 🙂
Can’t wait to see some dialogue on this…
Resolution | Open Optics · 2 January 2013 at 12:06
[…] Learn more python – I did learn a bit of python in 2012 and in general I thought it was an excellent programming language. It still has a few problems but is better than most alternatives (Matlab and labview) I’ve tried before. Over 2013 I will try and convert a number of my programs across to python. The most important obviously being the program that makes pointless, but pretty diagrams. […]