Saturday, May 25, 2013

Git can facilitate greater Reproducibility and Transparency - Karthik Ram

Reproducible science provides the critical standard by which published results are judged and central findings are either validated or refuted.
Reproducibility also allows others to build upon existing work and use it to test new ideas and develop methods. Advances over the years have resulted in the development of complex methodologies that allow us to collect ever increasing amounts of data.

While repeating expensive studies to validate findings is often difficult, a whole host of other reasons have contributed to the problem of reproducibility. One such reason has been the lack of detailed access to underlying data and statistical code used for analysis; such access gives others the opportunity to verify findings. In an era rife with costly retractions, scientists have an increasing burden to be more transparent in order to maintain their credibility. While post-publication sharing of data and code is on the rise, driven in part by funder mandates and journal requirements, access to such research outputs is still not very common.

Some examples of free git repository are:

By sharing detailed and versioned copies of one’s data and code, researchers can not only ensure that reviewers can make well-informed decisions, but also provide opportunities for such artifacts to be repurposed and brought to bear on new research questions. Opening up access to the data and software, not just the final publication, is one of the goals of the open science movement. Such sharing can lower barriers and serve as a powerful catalyst to accelerate progress. In an era of limited funding, there is a need to leverage existing data and code to the fullest extent to solve both applied and basic problems. This requires that scientists share their research artifacts more openly, with reasonable licenses that encourage fair use while providing credit to original authors.
Besides overcoming social challenges to these issues, existing technologies can also be leveraged to increase reproducibility. All scientists use version control in one form or another at various stages of their research projects, from data collection all the way to manuscript preparation. This process is often informal and haphazard, with multiple revisions of papers, code, and datasets saved as duplicate copies with uninformative file names (e.g. draft 1.doc, draft 2.doc). As authors receive new data and feedback from peers and collaborators, maintaining those versions and merging changes can result in an unmanageable proliferation of files. One solution to these problems would be to use a formal Version Control System (VCS), a kind of tool that has long been used in the software industry to manage code.
A key feature common to all types of VCS is the ability to save versions of files during development along with informative comments, which are referred to as commit messages.
Every change and its accompanying notes are stored independently of the files, which obviates the need for duplicate copies. Commits serve as checkpoints to which individual files or an entire project can be safely reverted when necessary. Most traditional VCS are centralized, which means that they require a connection to a central server that maintains the master copy. Users with appropriate privileges can check out copies, make changes, and upload them back to the server.
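As a rough illustration of the checkpoint idea, the Python sketch below models commits as snapshots that are stored apart from the working files, each with a commit message. It is a toy model under simplifying assumptions, not a description of how Git or any other VCS is implemented, and the names `Commit`, `TinyRepo`, `commit`, `revert`, and `log` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Commit:
    """One checkpoint: a snapshot of file contents plus a commit message."""
    message: str
    snapshot: dict  # maps file name -> file contents at commit time


@dataclass
class TinyRepo:
    """Toy version store (hypothetical; real systems are far more elaborate)."""
    working: dict = field(default_factory=dict)  # current file contents
    history: list = field(default_factory=list)  # commits, oldest first

    def commit(self, message: str) -> int:
        """Save a copy of the working files and return the commit index."""
        self.history.append(Commit(message, dict(self.working)))
        return len(self.history) - 1

    def revert(self, index: int) -> None:
        """Restore the working files to the state saved in a given commit."""
        self.working = dict(self.history[index].snapshot)

    def log(self) -> list:
        """Return the commit messages, oldest first."""
        return [c.message for c in self.history]


repo = TinyRepo()
repo.working["draft.txt"] = "Introduction"
first = repo.commit("Add introduction")
repo.working["draft.txt"] = "Introduction and methods"
repo.commit("Add methods section")
repo.revert(first)  # the working copy is back at the first checkpoint
```

Because the history lives in one place, no "draft 1" and "draft 2" files are needed: the log of messages records what changed, and any earlier state can be restored from it.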
Among the suite of version control systems currently available, Git stands out in particular because it offers features that make it desirable for managing artifacts of scientific research. The most compelling feature of Git is its decentralized and distributed nature. Every copy of a Git repository can serve either as the server (a central point for synchronizing changes) or as a client.

This ensures that there is no single point of failure. Authors can work asynchronously without being connected to a central server and synchronize their changes when possible. This is particularly useful when working from remote field sites where internet connections are often slow or non-existent. Unlike other VCS, every copy of a Git repository carries a complete history of all changes, including authorship, which can be viewed and searched by anyone. This feature allows new authors to build from any stage of a versioned project. Git also has a small footprint and nearly all operations occur locally.
By using a formal VCS, researchers can not only increase their own productivity but also make it easier for others to fully understand, use, and build upon their contributions. In the rest of the paper I describe how Git can be used to manage common science outputs and move on to describing larger use-cases and benefits of this workflow. Readers should note that I do not aim to provide a comprehensive review of version control systems or even Git itself. There are also other comparable alternatives such as Mercurial and Bazaar which provide many of the features described below. My goal here is to broadly outline some of the advantages of using one such system and how it can benefit individual researchers, collaborative efforts, and the wider research community.

Insects are Nutritious

This reminds me of when I was young. Termites were a favorite: after removing the legs and wings, we would eat them raw or roast them using our "Kinugi - boys grazing jiko". I came to learn that the thorax of these bugs is almost pure protein and that they are available in most areas. They were, and still are, easy to collect because of their high concentrations around their nests.
Insects are a popular diet staple in many countries. While rural and poor areas are most taken with the idea of eating insects, some countries qualify species as delicacies that come with a hefty price tag due to their effects. Here in Africa, for example, some insects are said to have medicinal as well as spiritual effects if eaten. 

A U.N. report published in May said the health benefits of consuming nutritious insects could help fight obesity.
It noted that over 2 billion people worldwide already supplement their diet with insects. More than 1,900 species of insects are eaten around the world, mainly in Africa and Asia, but people in the West generally turn their noses up at the likes of grasshoppers, termites and other crunchy fare.

The authors of the study by the Forestry Department, part of the U.N. Food and Agriculture Organisation (FAO), said many insects contained the same amount of protein and minerals as meat and more healthy fats doctors recommend in balanced diets.
One of the many insects that are eaten in Africa is the desert locust. This insect is eaten especially in Algeria, mostly by poor people, after being soaked in salt water and roasted under the sun. Caterpillars are another insect popularly eaten here in Africa. Many different species are consumed, and each has a distinct form of preparation. In one of them, the legs are removed and the caterpillar is then deep fried.

Wasps, beetles and other insects are currently “underutilised” as food for people and livestock, the report says. Insect farming is “one of the many ways to address food and feed security”.
The report also said this would be environmentally friendly since most insects are likely to produce fewer environmentally harmful greenhouse gases than other livestock.
The ammonia emissions associated with insect-rearing are far lower than those linked to conventional livestock such as pigs, says the report.

Insects can be a valuable and plentiful source of nutrition. Edible species can be found in all habitats on earth, and can prove to be vital to people's existence in many countries. While most of us may find them gross, remember that even the processed foods we consume have a percentage made up of insects! We just don't see them.

Sunday, August 7, 2011

Twenty-One Suggestions for Writing Good Scientific Papers

Twenty-One Suggestions for Writing Good Scientific Papers: Notes on Writing Papers and Theses
By Ken Lertzman (Bulletin of the Ecological Society of America 1996), modified by Dr. Michael D. Delong.

1. Know your audience and write for that specific audience.

Scientific and technical writing can almost never be 'general purpose'; it must be

written for a specific audience. For the kinds of writing addressed here, that audience

will generally be the community of ecologists who read a particular journal or study a

particular subject. This community is represented by your professor for class papers.

In all cases, you must adopt the style and level of writing that is appropriate for your

audience. Stylistic conventions and acceptable jargon can vary tremendously from

one field to another, and to some extent, from one journal to another. If you are

unfamiliar with the conventions of a field, study them as they are manifested in a

selection of highly regarded papers and in the "Instructions for Authors" for key journals.


2. Your supervisor/professor is not here to teach you basic grammar and spelling.

The more time and emotional energy she or he spends on correcting basic English

usage, the less remains for issues of content or fine-tuning. You are responsible for

mastering the basics of the language; save your supervisor's time for more substantive

issues. A few glitches and non-parallel tenses will slip through your own careful

editing, but there is no excuse for frequent ungrammatical sentences. Similarly, with

word processors and spellcheckers having become standard writing tools, typos or

other spelling errors should be very rare. Use a spelling checker before submitting

anything for anyone else's reading.

If you find you are about to submit a paper that you know contains poor writing,

consider why you are doing so. If there is a writing problem with which you are

having a hard time (for instance, organizing the structure of an argument in its most

effective form), it is legitimate to submit this for someone else's review with the

problem highlighted as a focused request for assistance. Otherwise, submitting a piece

of writing with known errors or problems means either: (1) you do not consider your

writing worth improving, (2) you do not respect the reader enough to present writing

that is as good as you can make it, or (3) you are incapable of improving the writing.

Every piece of writing, at some point, is as good as its writer can make it without

outside review. That is the time to give it to your supervisor.

3. Do Not Turn in a First Draft!

Ever! Most people's first drafts are terrible. I would not make anyone else suffer

through mine. Don't make others suffer through yours. I have read early drafts of

papers by eminent ecologists whose final products are jewels of English construction.

Their first drafts are terrible too. "Good writing is rewriting," and you should make

a serious effort at editing, rewriting, and fine-tuning before you give the manuscript to

anyone else to read. There are few things more frustrating to read than a paper in

which you know there are pearls of wisdom, but where sloppy and ambiguous writing

hides those pearls. The chapters of my Ph.D. thesis had been through 3-5 drafts before

anyone on my advisory committee ever saw them. If you need to put a piece of

writing away for a few days before you can approach it dispassionately enough to

rework it, do so.

It takes much longer to read poor writing than good writing. It is a waste of an

advisor's or editor's time to read material that is not yet ready to be presented - and it

is disrespectful to expect them to do so. When an advisor receives a thesis in which

the writing is poorly developed, expect them to go through enough of it to

demonstrate the kinds of changes required, and then return it with the rest unread.

Consider forming a mutual editing team with other students to review each

other's work. Publication quality scientific writing is usually a product of the

research community rather than the sole effort of the author(s): reviewers and

editors make a big difference to the vast majority of published papers. You

should become accustomed both to reviewing other people's work and to having

your own reviewed.

4. Get and use stylebooks.

All aspiring ecologists should have a library of books that supports their technical

communication. Distinguish between those that are primarily manuals of accepted

rules, those that address how to create a draft (e.g., disconnecting the creative from

the critical voice, etc.), and those that focus on rewriting. I recommend Williams

(1990) as a manual for rewriting. Williams focuses on how to turn a draft into a

finished product.

5. Avoid abusing word forms.

Use words in the form that conveys your meaning as clearly and simply as possible. A

variety of writing problems arise from using verbs and adjectives as nouns. Such

word forms are called nominalizations (Williams 1990). Consider the sentence, "The

low rate of encounters was a reflection of population density reductions." The verbs,

"to reflect" and "to reduce" are used as nouns, and the sentence is more turgid and less

direct than when they are used as verbs: "The low rate of encounters reflects a

reduced population density." Some nominalizations are both useful and effective, as

in "taxation without representation." Williams (1990) has an excellent discussion of

useless and useful nominalizations.

Creating awkward phrases where nouns and verbs are used as adjectives or adverbs

is another common problem leading to awkward and wooden writing. In his delightful

critique, Hildebrand (1983) called nouns used this way "adjectival nouns." Such

constructions are almost invariably clumsy and unclear. For instance, unless

specifically referring to a document, "the Chilko Lake park proposal" is not as good

as "the proposal for a park at Chilko Lake." The first form illustrates both a

nominalization ("proposal" as noun vs. verb) and adjectival nouns ("Chilko Lake" and

"park" as adjectives modifying "proposal" rather than nouns).

6. Do not use more words where fewer will do.

Do not use long words where short ones will do. A good example is using "utilization"

when "use" will do. Do not use jargon where regular language will do. Another

example is the use of "in order to." Any time you write that phrase, delete it and

replace it with "to." You will find that it does the job nicely. Do not use special words

to make your writing seem more technical, scientific, or academic when the message

is more clearly presented otherwise.
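A small script can help with a first pass at this kind of tightening. The Python sketch below flags the two examples mentioned above ("in order to" and "utilization") and suggests the shorter form. It is only an illustration: the `WORDY` table and the function name `flag_wordy` are hypothetical, the list of phrases is deliberately tiny, and a script cannot judge whether a given replacement actually reads better.

```python
import re

# Replacements taken from the examples in the text; extend with your own.
WORDY = {
    "in order to": "to",
    "utilization": "use",
}


def flag_wordy(text: str) -> list:
    """Return (phrase, suggestion, offset) for each wordy phrase found."""
    hits = []
    for phrase, suggestion in WORDY.items():
        # Match whole words only, ignoring capitalization.
        pattern = r"\b" + re.escape(phrase) + r"\b"
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            hits.append((phrase, suggestion, match.start()))
    return sorted(hits, key=lambda hit: hit[2])  # order of appearance


sample = "In order to estimate utilization, we sampled ten plots."
for phrase, suggestion, offset in flag_wordy(sample):
    print(f'offset {offset}: consider "{suggestion}" instead of "{phrase}"')
```

Treat the output as a list of places to look at, not as automatic corrections.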

7. Use an outline to organize your ideas and writing.

When you first start a writing project, make an outline of the major headings. List the

key ideas to be covered under each heading. Organize your thinking and the

logic of your arguments at this level, not when you are trying to write complete,

grammatical, and elegant sentences. Separate out the three tasks of: (1) figuring out

what you want to say, (2) planning the order and logic of your arguments, and (3)

crafting the exact language in which you will express your ideas.

It is very easy to write and expand outlines with word processors. When starting a

writing project, I create a file in which I first develop an outline as described above. I

save a copy of the outline separately and then commence the writing by expanding the

outline section-by-section. I usually get ideas for later sections while writing earlier

ones and can easily page down and write myself notes under later section headings.

This is especially useful for filling out the structure of a Discussion while writing the

Results (for instance, "When discussing the removal experiment, don't forget to

contrast Karamozov's 1982 paper - his Table 3 - with the astonishing results in Figure

7."). By the time I get to writing the Discussion, the outline has usually been fleshed

out substantially and most of the topic sentences are present in note form.

8. Think about the structure of paragraphs.

Poorly structured paragraphs are one of the most common problems found in student

writing. Though most students can write reasonable sentences, a surprising number

have difficulty organizing sentences into effective paragraphs. A paragraph should

begin with a topic sentence that sets the stage clearly for what will follow. One of my

most frequent comments on student papers is that the contents of a paragraph do not

reflect the topic sentence. Make topic sentences short and direct. Build the paragraph

from the ideas introduced in your topic sentence and make the flow of individual

sentences follow a logical sequence.

Many writers try to finish each paragraph with a sentence that forms a bridge to the

next paragraph. Paying attention to continuity between paragraphs is a good idea.

However, such sentences are often better as a topic sentence for the following

paragraph than a concluding sentence of the current one. It is nice to conclude a

paragraph by recapitulating its main points and anticipating what follows, but you

should avoid statements of conclusion or introduction that contain no new information

or ideas.

Strive for parallelism in structure at all times. When you present a list of ideas that

you will explore further ("Three hypotheses may account for these results: hypothesis

1, hypothesis 2, hypothesis 3."), make sure that you address the ideas in the same

sequence and format in which you have presented them initially. It is both confusing

and frustrating to read a list presented as '1, 2, 3, 4,' and then find the topics dealt with


Think about how the structure of your paragraphs will appear to the reader who is

reading them for the first time. The reader should not have to read the text more than

once to understand it. Carefully lead the reader along so that the structure of your

argument as a whole is clear, as well as where the current text fits in it.

Paragraphs containing only one or two sentences are rarely good paragraphs because

they can't develop ideas adequately. Two-sentence paragraphs usually represent either

misplaced pieces of other paragraphs or fragments of ideas that should be removed or

expanded. Outlining helps pull topics together. It may initially appear that a

separate paragraph is needed to define each when, in fact, the topics are quite related

and can be included in the same paragraph.

Choppiness both within and among paragraphs often results from the ease with which

we can cut and paste text on the computer. Ideas that were written separately but

belong together can be moved easily. Unfortunately, they often still read as if they

were written separately. This is a great way to structure a draft. However, you must
read over such text for continuity before submitting it to others for review.

It is difficult to read for continuity on the computer screen because you can see so

little text in front of you at any given moment. It is also more difficult to flip over

several pages to scan for repetition, parallel structure, etc. To do a really good job of

proofing a paper, most writers find it necessary to read hard copy at some point

during the writing/rewriting process. Print all but final drafts on paper that has been

used previously on one side.

9. Pay attention to tenses.

Problems of inappropriate or inconsistent tenses are common in student writing. What

you, or others, did in the past should be stated in the past tense (e.g., "data were

collected..."). Events or objects that continue to happen or exist can be described in

the present tense (e.g., "in this paper, I examine..."; "The data reject the hypothesis

that..."). Events that will take place in the future can be in the future tense. Whatever

tense you choose, be consistent. Be careful in using "might," "may," and "would" (as

in "this might indicate that..."). They are frequently used as ways of weaseling out of

making a clear statement.

10. Captions should not merely name a table or figure, they should explain how to

read it.

A caption (figure or table heading) should contain sufficient information so that a

reader can understand a table or figure, in most cases, without reference to the text.

While very simple tables and figures may require only a title for clarity, and

exceptionally complex ones may require reference to the text for explanation, these

circumstances are rare. Captions are often most effective when they briefly

summarize the main result presented in the table or figure. Do not leave caption

writing to the end of the project; write captions when you organize your Results

section and it will help you write the text.

11. When citing a reference, focus on the ideas, not the authors.

Unless the person who reported a result is an important point in a statement, literature

citations should be parenthetical, rather than in the body of the sentence. For instance,

in most cases, it is preferable to write a sentence of the form "Though mean growth

rates in Idaho were < 10 cm per year, growth rates of > 80 cm are common in

populations in Alberta (Marx 1982)." rather than "Though mean growth rates in Idaho

were < 10 cm per year, Marx (1982) found growth rates of > 80 cm to be common in

populations in Alberta." Sometimes the identity of the writer is important to the

meaning of a statement, in which case emphasis on the citation is appropriate (e.g.,

"Jones (1986) rejected this hypothesis; however, Meany's (1990) reanalysis of his data

failed to do so.").

12. Show us, don't tell us.

Rather than telling the reader that a result is interesting or significant, show them how

it is interesting or significant. For instance, rather than "The large difference in mean

size between population C and population D is particularly interesting," write "Mean

size generally varied among populations by only a few centimeters, but mean size in

populations C and D differed by 25 cm. Two hypotheses could account for this...."

Rather than describing a result, show the reader what they need to know to come to

their own conclusion about it.

13. Write about your results, not your tables, figures, and statistics.

Confusing and disjointed Results sections often arise because the writer does not have

a clear idea of the story she/he intends to tell. The frequent consequence of this is a

Results section consisting of a long, seemingly unrelated sequence of tables and

figures. We often go through a lengthy and convoluted process in understanding the

content of a data set; your paper need not document all the twists and turns of that

process. Expect that you will produce many more figures and perform many more

statistical tests than will be included in the final written product. When preparing to

write your results, decide on the elements of the story you wish to tell, then choose

the subset of text, figures, and tables that most effectively and concisely conveys your

message. Organize this subset of tables and figures in a logical sequence; then write

your story around them.

Novice writers of scientific papers frequently pay too little attention to discussing the

content of tables and figures. They sometimes merely present a list of references (e.g.,

"Table 1 shows this result, Table 2 shows that result, Figure 1 shows the other result.").

When writing Results sections you should use the tables and figures to illustrate

points in the text, rather than making them the subject of your text. Rather than

writing, "Figure 4 shows the relationship between the numbers of species A and

species B," write "The abundances of species A and B were inversely related (Figure

4)." Distinguish between your scientific results and the methodological tools used to

support and present those results.

14. Focus on ecological hypotheses, not statistical hypotheses.

Most students have learned the importance of having and testing clear hypotheses.

Unfortunately, many focus their writing on statistical hypotheses, not ecological

hypotheses. Statistical hypotheses are generally a trivial consequence of standard

approaches to statistical inference, such as the null hypothesis of no difference

between two populations. They rarely have inherent ecological significance and are

meaningful only in the context of the specific test being performed. Focus your

writing on the ecological hypotheses underlying your research (e.g., that species A is

influenced by processes X and Y in a specific way, resulting in different growth rates

in habitats S and T), not the statistical null hypotheses required to test specific

predictions of those ecological hypotheses (e.g., there is no difference in growth rates

among populations of species A in habitats S and T).

15. Develop a strategy for your Discussion.

Many novice paper writers begin their Discussion section with a statement about

problems with their methods or the items in their results about which they feel most

insecure. Unless these really are the most important thing about your research (in

which case you have problems), save them for later. Begin a Discussion with a short

restatement of the most important points from your results. Start with what you can

say clearly based on what you did, not what you cannot say or what you did not do.

Use this statement to set up the ideas you want to focus on in interpreting your results

and relating them to the literature. Use sub-headings that structure the discussion

around these ideas.

16. Introductions and conclusions are the hardest parts - plan on spending a lot of

time on them.

Many technical writers prefer to write their introductions last because it is too difficult

to craft that balance of general context and specific focus required for a good

introduction. Often it is easier to achieve this after you have already worked through

writing the entire paper or thesis. If you need to write the introduction first to set the

stage for your own thinking, resist the temptation to perfect it. The introduction will

likely need substantial modification by the time you have finished the rest of the

paper. The same concerns apply to conclusions, abstracts, and summaries. These

components of the paper are all that many people will read, and you must get your

message across in as direct, crisp, and enticing a manner as possible. Plan on taking

your time and giving these components several more drafts than the rest of the paper.

17. Break up large projects into small pieces and work on the pieces.

Don't write a thesis; write chapters or papers. Many thesis writers have a hard time

starting to write because they are intimidated by the huge project looming ahead of

them. As a result, their first few months' efforts are often awkward and disjointed, as

well as sparse. The thesis should be separated into small discrete sections, ideally

distinct publishable papers. The overall organization of ideas should be done during

the planning stage so that when you work on individual sections you can concentrate

on them.

Don't wait until you think you have completed all your analyses to start writing.

'Parallel processing' of writing one chapter while you complete the analyses for others

and make presentation quality figures is a good strategy for avoiding writer's burnout.

Writing and analysis for any given chapter or paper is often an iterative process.

Writing the results section of a paper is often the best way to discover the analyses

and figures that still need to be done.

18. Make your writing flow and resonate.

Probably the most frustrating and useful review I have received was from my master's

advisor on a draft of a paper from my M.S. thesis. He said that all the key points were

there and that the writing was clear, but it did not 'flow and resonate.' He sent me back

to rework it, and, eventually, the published product did 'flow and resonate' (at least we

thought so).

Once or twice a year I come across a paper that is written so well it is a joy to read. If

the content is as good as the writing, the experience of reading it can shape my


for some time thereafter. Papers written so well that they 'flow and resonate' are much

more likely to influence your readers than the equivalent message presented in a form

that is merely clear. When you find a paper that succeeds in this, study carefully how

the authors constructed their arguments and used language; try to identify what makes

the paper work so well.

19. Use word processors effectively and back up your work religiously.

Computers have improved tremendously the ease with which we can edit, shuffle,

rewrite, and spell-check a paper. To do this efficiently requires investing time in

learning about your tools. You need not learn how to use all the more exotic features

of your word processor, but learn the options that are available and how to find out

the details when you need them. Minimally, be familiar with basic requirements for

document formatting (character and paragraph formatting, how to make lists with

hanging indents, page organization, etc.) and basic operating system requirements

(copying and saving files, doing directory searches). The same comments apply to the

use of statistical packages, graphics programs, and spreadsheets. It is often possible to

get the job done with little finesse in manipulating your software, but you will usually

do a better job more efficiently after some investment in technical skills.

Almost everyone seems to require their own personal disaster to convince them of the

need for backing up important files regularly. The frequency of 'lost file' based

excuses for late papers is remarkable. I save files to my hard drive frequently during

working sessions and at the end of each session I make a back-up copy of any file that

I would mind losing. The working memory of your computer is transitory and easily

purged of its contents. Individual hard and floppy disks are little better as permanent

storage forms. Redundant copies dispersed in space and time are your main hope for

avoiding disasters. When you have invested a lot in a writing project such as a thesis

that is nearing completion, keep at least one at school at all times - in addition to your

working copy on a hard drive. Keep sample hard copies of recent drafts until you

complete the project.
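One way to make end-of-session backups routine is to script them. The Python sketch below copies a file into a backup folder under a timestamped name, so earlier copies are never overwritten. It is a minimal sketch under stated assumptions: the function name `back_up`, the file name `thesis.txt`, and the folder name `backups` are hypothetical, and the demonstration writes to a temporary directory, whereas a real backup should go to a different disk or location.

```python
import shutil
import tempfile
from datetime import datetime
from pathlib import Path


def back_up(source: Path, backup_dir: Path) -> Path:
    """Copy `source` into `backup_dir` under a timestamped name."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    target = backup_dir / f"{source.stem}-{stamp}{source.suffix}"
    shutil.copy2(source, target)  # copy2 also keeps the modification time
    return target


# Demonstration in a throwaway directory (hypothetical file names).
workspace = Path(tempfile.mkdtemp())
thesis = workspace / "thesis.txt"
thesis.write_text("Chapter 1 draft")
copy_path = back_up(thesis, workspace / "backups")
```

Pointing `backup_dir` at a second drive or a synchronized folder gives the copies "dispersed in space and time" that the advice above calls for.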

20. Take editorial comments seriously.

It may be clear from an editor's comments that they did not understand the point you

were making. If so, that is a clear indication that you need to improve your writing.

Also, an editor, no matter who they might be, has invested their time to help improve

the quality of your writing. Respect their investment.

21. One last style suggestion: limit the use of prepositional phrases at the start

of sentences and limit the use of 'the.'

It is very easy to start a sentence with a prepositional phrase; however, it often causes

the main point of the sentence to be lost. Reread a sentence that starts with a

prepositional phrase but place the phrase somewhere within the sentence, even at the

end. You will often find that the sentence reads more clearly with the prepositional

phrase buried within the sentence or that you do not need the phrase at all.

"The" is probably the most overused word in the English language. When rewriting

your first draft, think about whether or not the placement of every "the" is necessary.

For example: "The samples were taken using a Ponar dredge" reads just as well when

written as "Samples were taken using a Ponar dredge." The only difference is the

latter sentence is neat, tidy, and to the point.


The materials presented are taken, with permission, from an article by Ken Lertzman

(Bulletin of the Ecological Society of America 1996). I have made some additions

and modifications; however, credit for this work should be given to Dr. Lertzman.

Literature Cited

Hildebrand, M. 1983. Noun use criticism. Science 221:698.

Williams, J.M. 1990. Style: toward clarity and grace. University of Chicago Press,

Chicago, Illinois, USA.