

Thursday, September 16, 2010

Sweave, LaTeX and R

Yesterday, I finally got the hang of Sweave, R and LaTeX.

This essentially means that I can write my scientific paper in LaTeX, insert code chunks in the text, feed it to R (through the Sweave package) and get perfectly formatted output in APA style for any paper I choose to write. It's taken me a few months of on-and-off trying, but I've finally done it. That being said, I'd like to share some of the things that caught me out, so that others can benefit.


Before Installation:
If you've been using LaTeX or R on Windows, they were probably installed in the Program Files folder. This will cause you no end of problems.
Re-install these programs on the C: drive, in a path with no spaces, as LaTeX doesn't like spaces in path names. For example, install LaTeX in C:\Miktex and install R in C:\bin\R. This will head off a lot of problems that you would otherwise encounter.

When installing LaTeX, be aware that there are a number of text editors you can use with it. I am using TeXnicCenter, as it came with my distribution; it's also open source, which is a plus. Others include WinEdt, which is shareware and apparently quite good. Vim and Emacs (via Emacs Speaks Statistics) are the only editors that provide completion for R code, but both of these programs take a lot of effort to learn. In any case, starting out by working entirely in LaTeX is very difficult.

The next step is learning to use R. If you are a psychologist, download the arm package and the psych package: arm gives you useful regression diagnostics, and psych provides all the psychometric tools one could need (see here for the author's website, which I devoured when I started learning R). Unfortunately psych doesn't provide IRT methods, but these can be accessed through the ltm and eRm packages, which are also easy to obtain (select the install packages option in the R menus, select a mirror site close to you, select the package name - done).

For exporting to LaTeX, there are a number of packages which do different things. I'm currently using xtable, but this doesn't have a predefined method for factor matrices, which is a pain. The manual does show you how to define methods for new classes though, and I will share my results when I have made this work.
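As a sketch of the sort of thing xtable does well already (the matrix name cors and the options here are purely illustrative, not a recipe):

```latex
% A chunk from the .Rnw file: results=tex passes xtable's output
% straight through to LaTeX as a formatted table.
<<echo=FALSE, results=tex>>=
library(xtable)
print(xtable(cors, digits = 2,
             caption = "Correlations between the scales"))
@
```

Anything xtable has a method for (matrices, data frames, lm summaries and so on) can be dropped into the document this way.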

One extremely important thing to remember (and something that stumped me for a while) is the syntax for inserting R code.
<<echo=FALSE, results=tex>>=
some R code
@
The double angle brackets and the equals sign signal the start of the R code chunk, and the options within the brackets define how the output looks (echo=FALSE means that the R code itself will not be shown, and results=tex tells Sweave to pass the chunk's output through as raw LaTeX). The @ sign ends the code chunk. Now, the part that got me was this: the R code, the angle brackets and the @ all need to be left justified, otherwise this does not work. This means that if you want to insert a table from your results, do this after running the code through Sweave.
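To put all of that together, a minimal Sweave file (call it paper.Rnw; I'm using the plain article class here only as a stand-in for the APA class) looks something like this:

```latex
\documentclass{article}
\begin{document}

Some ordinary LaTeX text, followed by a chunk whose code is
hidden but whose output is typeset as a LaTeX table:

<<echo=FALSE, results=tex>>=
library(xtable)
print(xtable(summary(lm(dist ~ speed, data = cars))))
@

\end{document}
```

Running Sweave("paper.Rnw") inside R turns this into paper.tex, which you then compile with LaTeX as usual.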

At the moment (since I am neither an expeRt nor a TeXpert) I am creating the LaTeX objects in R, and then telling R to print them to LaTeX. This allows me to ensure that the objects are created properly before I send them to LaTeX. Sweave will tell you if it has a coding problem and where that occurred, but some things look OK until you actually see them typeset.

The next step is to download the apa package for LaTeX, which will allow you to format the paper in APA style. This is the part that tends not to work if your LaTeX distribution path has spaces in it, so make sure that doesn't happen (I actually reinstalled R and LaTeX on my machine in the recommended places, and now it works like a dream).

You will probably need to learn a little LaTeX, but if you use WinEdt, TeXnicCenter or LyX, then there is a GUI with menus that can aid in this. There are some Sweave templates scattered about the web, and you should probably use one of these. It's probably worth reading either this or this (or both) guide to using LaTeX.

With R, as long as you understand some statistics, it's easy enough to Google and then read the recommended files. The introduction is extremely terse and focuses on fundamentals rather than applied analysis, but it's useful for its description of summary, plot, lm and the other really useful generic functions.

Sunday, August 22, 2010

Grad school, Irish style.

Taking some time off from my placebo series, I'd like to talk about my experience as a PhD student in Ireland.

This is somewhat inspired by the zomg grad school blog carnival, but I was too busy to submit in time.

It's also inspired by the fact that everyone who submitted to that carnival was a natural scientist, which impels me to give the social science side of the equation.

First, a few notes on Irish PhDs versus the American grad school experience.
First off, there is very little funding: Ireland is in a depression at the moment, and never really put much money into the social sciences before that.
Secondly, what funding there is (in my area at least) tends to be awarded to the student rather than to a PI. Luckily enough, I did get funding (although it doesn't cover conferences or expenses, which sucks).

Also, there tend not to be many courses; you are essentially thrown into research, which I prefer but which many people would not find appealing. I sometimes wish that I'd had people to explain methods and stats in a lot of detail to me at the start, but then again, learning that stuff myself has been extremely rewarding.

So, without further ado, here are my top ten tips for surviving a PhD.

1) Do something you like - this is extremely important, as if you don't like your thesis, it's unlikely that you will finish on time or that anyone else will care. Liking your PhD also makes it easier to write good grant applications.

2) Try to figure out what you want to do, in some detail, ASAP. This again is critical to finishing on time. Don't worry if your methods or approach changes, just figure out your key question and how you are going to assess it. Then draw up a schedule. You won't stick to it, but it can often be a spur to ensure that you keep working.

3) Work consistently. This was really difficult for me, as I was always a crammer in school and undergrad. However, this will not work for a PhD (if you want to finish on time at least) so get into the habit of doing some work at least 4 days a week. This is very important when you are, like me, an independent scholar without compatriots in a lab somewhere.

4) Read outside your discipline, especially for methods. Often, the methods in your field will be some amalgam of tradition, stupidity and lack of thought. Other disciplines can often point out the blind spots of your own.

5) Read, read, read. Spend at least six months reading before you start collecting data. Make sure you read around any instrument you plan on using. This can often give you a good idea of unanswered questions, which can help you get published (which is important if you want to stay in academia).

6) In total contrast to the last point, start collecting and analysing data ASAP. There's nothing like trying to figure out your own data to help you understand the methods you are using. If something doesn't make sense, Google it and read some papers. It's likely that someone else has had the same problems, and they may know how to solve them. If you can't collect data quickly for some reason, search the internet and start analysing other people's data for practice.

7) Use R - seriously, if you intend to do any kind of hardcore statistical analysis, use R. It's the best stats program out there, and is constantly having new packages added. It's made me a much better scientist, both by forcing me to learn exactly what I'm doing (to decipher the error messages) and by centralising all of the routines I need in one place. Most psychologists end up using SPSS, some IRT program, some SEM program and various other odds and ends. R does all of this, so just learn it now before you get left behind.

8) Take some time off. I've lost count of the number of times I've been stumped on a problem, taken a couple of hours or a day off, and had the solution come to me while I was relaxing. Creative thought and hard slog do not often co-occur, so make time for both.

9) Use as many useful computer tools as possible. Get a computerised reference manager, cos references are annoying. Get a good stats program (use R). Get a good qualitative analysis program (I'm using NVivo, but there's probably a good open source alternative). Learn LaTeX, lest you lose a whole chapter to the demons that infest Word.

10) Write, write, write. It's often easier to understand what the problems are once you try to explain yourself. Aim to write a few hundred words a day. Take notes on absolutely everything you read; this will save you time in the long run.

Finally, have fun! Doing research is supposed to be fun, and you can bet your ass that all the greats enjoyed their work. To paraphrase something I heard once: Doing a PhD is like living your life; if you're not enjoying it, neither the life nor the PhD will turn out to be any good.

Tuesday, July 20, 2010

Placebos: All you never wanted to know (Part 1)

Well, it's that time of the week again when I can't put off blogging any longer. I have a terrible habit of putting off blogging (which I enjoy) to ensure that I actually complete my PhD. Therefore, I've decided to start blogging about my actual research.

To wit, everybody's favourite sugar pill: the placebo!

This will be a relatively long series, with about seven parts. Essentially, I'm updating my literature review this week, so I'll blog about each section as I do it (perhaps before, if I get really into this series).

Anyway, we'll start with the hard part: definitions. The placebo is something that most people in our society have an idea about, but it's a surprisingly difficult phenomenon to define. That being said, almost everyone in the field has tried their hand at it, so there's a lot to choose from.

The first, classic definition is from Shapiro & Shapiro (1997) - the placebo effect is the result of a placebo treatment.
Pretty illuminating eh? The sad part is that this definition was the end of their long and ultimately fruitless search for a good way of describing the phenomenon.

That being said, it has its good points: it can account for all placebo effects, it doesn't presuppose any mechanisms, and it doesn't limit the phenomenon unduly.

However, its bad points are also legion, the largest being that it's a tautology, and not in the universal-truth sense.

Probably the definition most people are familiar with is this one: the placebo effect is the effect seen in the placebo arm of a double blind trial. However, this one also has large problems. The major issue with this definition is that not all of the response in a placebo arm will be down to the placebo.

One thing that can mess up this definition is a funny little phenomenon called regression to the mean. Regression to the mean is a statistical phenomenon that works as follows. You select sick people for a trial on the basis of their sickness; say their sickness, measured on a ten-point scale, is a seven. Now, even if the treatment you give them is harmful, it is likely that some of them will report less sickness after a week, simply because the next measurement is more likely to be closer to the mean. I'm relatively sure that this could be eliminated with a perfectly reliable instrument, but we don't have any of those (certainly not in psychology).

Warning: the previous example assumes a normal distribution. If in doubt, consult a friendly statistician (if you can find one). Update: apparently it only requires a distribution with equal marginal probabilities - I do remember seeing an explanation that used the normal distribution, though.
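For the curious, the textbook version of the argument runs as follows (this is the bivariate normal case I remember; here rho is the test-retest correlation of the sickness measure):

```latex
% For sickness scores (X_1, X_2) with common mean mu and
% test-retest correlation rho, the expected second score of a
% patient selected at X_1 = x is
\[
  E[X_2 \mid X_1 = x] = \mu + \rho\,(x - \mu).
\]
% With rho < 1, anyone selected because x > mu is expected to
% score closer to mu at retest, i.e. to look "improved" with no
% treatment effect at all; only a perfectly reliable instrument
% (rho = 1) would remove this.
```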

Another feature that can cause issues in estimating the placebo effect is the natural history of a sickness. The major problem here is that people's health may wax and wane, and again if you select a person for inclusion on the basis of sickness, the natural history effect could cause them to report feeling better even in the absence of any real effect from your treatment.

So, if you actually want to estimate the placebo effect accurately, you need a no-treatment group. These poor suckers are recruited into the trial on the basis of sickness, and then don't get anything to help them, except being poked and prodded by doctors and nurses. Many clinical trials don't include these groups, and it's easy to see why. It's bad enough that you have to give half the participants placebo, but giving another group of people nothing is way too harsh. (We'll get back to clinical trials with no-treatment groups later, I promise.)

So, following on from this long and rambling excursion into clinical trials, we can update our definition of the placebo effect as follows: the placebo effect is the improvement seen in the placebo arm less the improvement in the no-treatment arm.

So, this is the workhorse of placebo definitions, but it still won't do. This definition requires a particular setting which does not fit many of the places where placebo effects occur. For example, the response shown by a patient to the archetypal sugar pill after a visit to the doctor cannot be accounted for with this particular definition. So, we'll have to move on.

A more recent definition came from Price et al (2008) where they claimed that a placebo was any effect which simulated a treatment.

A fascinating recent study by Oken et al gave us some interesting findings. Essentially, it was an RCT which randomised seniors (65-80 years old) to either placebo or no treatment. Those given the placebo were told that the pill would improve their memory, and lo and behold it did: they scored better on measures of verbal and working memory (interestingly enough, only the men showed this effect).

This is a problem for definitions of the placebo which rely on the notion of treatment. I can't really see how the effects of this pill could be considered a treatment: it acted as a neuro-enhancer rather than something to stave off decline. So, it looks like we may have to consign the Price et al definition to the fire.

A definition which can account for the experiment noted above is that of Daniel Moerman, an anthropologist: a placebo is the positive mental or physical effects induced by the meaning of a substance or procedure. He prefers to call placebos the meaning response, which is a much nicer phrase than placebo (or at least has less negative associations).

I really like Moerman's definition (and his book is really very good, even if you're not a specialist). However, there are some weasel words in there, the main culprit being "meaning".
So, boys and girls, what does meaning mean?

Presumably it refers to the interpretation one gives to something, but it's a hard word to define and, even worse, a horrible word to attempt to operationalise (i.e. figure out how to define or measure it). Although, that being said, I suppose we could just substitute expectancy for meaning and get on with our research.

That, dear readers, would probably be letting you off a little lightly though. So, let's move on to another definition, this one from a wonderful scientist and human being, Dr Zelda Di Blasi (2001). She and her colleagues renamed the placebo (everyone loves doing this) as context effects (which, again, is nice and doesn't have negative associations) and said: a placebo is an inert substance which has an effect due to context.

This is nice: it again leaves the mechanisms open and, wonderfully enough, doesn't preclude non-health-related placebos. However, context (to me at least) means what surrounds the patient, and this ignores the fact that focusing on bodily sensations increases the size of the placebo effect.

The other issue with this definition is that it somewhat marginalises the role of the person who experiences the placebo effect, as it implies that all the impetus comes from outside, when clearly the internal experience is perhaps the defining characteristic.

Moving on, I think that the term placebo is growing more and more useless. These days, it's used by many and seems (in psychology at least) to be a convenient shorthand for the effects of the mind on the body. My first exhibit for this kind of thing is the 2007 paper by Crum and Langer (if you're Irish you probably giggled at that last name; otherwise, carry on), which is called: Mind-Set Matters: Exercise and the Placebo Effect.

The study itself is really interesting: they took a large group of hotels, matched them, and randomised the hotels to either control or treatment. In the treatment hotels, they told the cleaning staff how many calories they burned in the course of their work. In the control hotels, they just talked to them for a while and got them to fill out some forms.

The really interesting part was that the women (I believe the entire sample was female) who were told about their calorie-burning habits lost more weight over the next month, and were both healthier and happier by the end of the study. I suppose the take-home message from this study is that you should learn how many calories you burn in your daily activities if you want to lose weight.

However, my point here is that the use of the term placebo effect here is confusing and causing problems with our understanding of the concept. I personally would much prefer to have a placebo effect that only related to healthcare and medicine, along with mind/body effects or expectancy effects for the Oken and Crum studies I noted above.

To be honest though, I'm not going to lose too much sleep over the definition of the effect. Having read some of the Shapiro papers where they grapple with the construct over the years, I've come to the conclusion that it's a waste of effort and time that could be better spent trying to figure out how to induce the damn thing (whatever we call it) reliably.

Monday, June 21, 2010

On publishing and journals

So, I'm currently writing my first paper for publication. Woo hoo, and what not.

Therefore, I've started to pay attention to things like impact factors. Impact factors, for those of you who don't know, are numbers that reflect how often an average paper from a journal has been cited over a recent period (the standard measure uses the last two years, though a five-year variant also exists). Think of it as a journal's reputation, if you will.

Many people claim that the bigger the impact factor, the better the journal. This came up most recently in the Chronicle of Higher Education, where a number of people moaned about the amount of research that goes uncited. Of course, this doesn't control for the number of people in a field or the "sexiness" of a topic, so obviously it's not the whole story.

Now, the other major factor (for me at least) is the time taken to review. Psychology apparently takes a long, long time to review journal articles; I've seen many papers that show a two-year lag between submission and print publication. Most journals operate a pre-print service these days, which means that one might only wait a year before others see your work.

So, when choosing a journal, I find myself making a trade-off. Should I go for a lower impact journal that reviews quickly, or a higher impact journal that will take longer, but make my research more visible?

Another point to remember is that you can't submit to multiple journals at once, so the reviewing time is an opportunity cost for the researcher. This is of particular relevance for students like myself, who need to get papers published quickly in order to be able to show them on a CV and thus get a job (and the opportunity to do more research).

I'm still not decided on which route to go, and time is running out. The deadline is somewhat external and somewhat self-imposed: my funders want a report in ten days, and I'd like to be able to claim that a paper is under review by that time. If anyone is reading and has advice, it would be greatly appreciated.

Thursday, June 17, 2010

On the absurdity of marking schemes

So, I'm a PhD student somewhere in the South of Ireland.
Recently, I taught my very first class, which was nice.

Even more recently, I had to mark all the scripts, which was not so nice.

You see, in my university, psychology (which I have been assured is, in fact, a science) is examined like a liberal arts degree, i.e. with essay questions.

All well and good, you say. However, the marking scheme - which is handed down from on high - is crazy. And not in a good, funny, sort of entertaining way, but in the hair-pulling, chair-destroying, data-falsifying kind of way.

Here's a breakdown of how the marks work:
A (or First): 70-100%
B (or 2.1): 60-69%
C (or 2.2): 50-59%
D (or 3.1): 45-49%
E (or Pass): 40-44%
F (or Fail): 0-39%

Now, I'm sure that many of you can spot the issues here, but I'll illustrate anyway. The A covers 30% of the scale, and is subdivided into three (as are the other grades). However, the A sub-grades are separated by 10% each (75, 85, 95), while the E sub-grades are separated by only 1% each.

So essentially what the marking scheme dictates is that there is ten times more difference between the A grades than between the E grades. It's absurd, and yet it occurs everywhere on this emerald isle (and also in the UK, but don't quote me on that).

The worst part, for me at least, is that the A is rare, very rare in fact, and most of the marks are squashed into the 55-69% range, which gives students a very misleading idea of their relative standing in the class and between classes.

There's a sizeable majority of the scale (A and F) that is used perhaps 3-5% of the time, and everyone else just gets pushed into the dank and unwholesome middle. Now personally, I'd prefer it if the scale were divided into 15 points per grade and if As were a real possibility, rather than a carrot used to urge undergraduates into insane amounts of study for very little reward.

Unfortunately, it's not up to me, but rather up to the NUI, and I hardly think they'll change it because of this blog post. In the unlikely event that they do, I would of course accept recompense from any grateful students or teachers.