nickharding's posterous

Coding in R: Things I've had to look up.

Not so much a blog post as a personal record of things that I've had to look up, kept so that all the tips/tutorials/mailing list answers are in one handy place.

*********

15/5/2012

Really nice guide/intro to multilevel models in R using the lme4 package:

http://www.rensenieuwenhuis.nl/r-sessions-16-multilevel-model-specification-l...

 

 

14/5/2012

Deleting columns from a dataframe:

> subset(data.jobs$CountCovariates$data, select = -X_cov);

http://stackoverflow.com/questions/4605206/drop-columns-r-data-frame

 

14/5/2012

The %in% operator.

> X %in% Y

Returns a logical vector the same length as X, indicating whether or not each element of X is present in Y.

 

 

Posted May 14, 2012

First impressions of Google Plus

So, I have had Google Plus for a little while now and I think I'm getting to grips with it! However, there are a few things I'm not clear on...

1. I can't see how circles are an adequate replacement for Facebook groups. Take the example of a sports club, where a set of people need to share information. In Google Plus I suppose the club president arranges everyone into a circle and shares something with that circle, then all the members get it on their incoming feed. This allows people to discuss something shared by the originator, but how can other members then share something else with the whole group? They don't have access to that circle. A one-click option to group everyone on a share list into a circle would be nice, although groups are still a much more elegant solution, as you only have one copy of the group, maintained by the most involved members. This makes me think of static vs instance objects in programming. If people have to maintain their own circle then there will be omissions and errors.

2. Why is there nothing telling me how many unread emails I have in my Gmail inbox? This seems like quite an obvious omission.

3. One of the irritating things about Facebook was (and I guess, is) that people who had no account couldn't be included in a group, discussion or whatever. Google allows you to share via email, but I'm not sure if people can comment etc. using just an email address. This would be really neat, and at a time when there are multiple social networks, a tool that allows people without an account to participate would be great. After all, not everyone you want to invite to an event will have a Facebook or Google Plus account. Why can't a comment stream be interchangeable with a group email?

4. Google Reader. I use Google Reader to keep an eye on various blogs and forums, but Reader doesn't seem to be linked up to Plus yet. I am given 3 options when I find something of interest: star, which means I favourite it; like, which adds me to some list with other people that have liked it; and share, which shares it with people that follow me on Google Buzz. Now, surely Buzz is made obsolete by Plus, and the use of "share" is pretty confusing here when it seems to be the key word on Google Plus. I don't think there is a way to directly share something to Plus from Reader. This could be streamlined by a sharing option a la Plus, with the option of leaving a trace on the original blog so people with similar interests can find you. Is like different to +1?! Yes...as likes appear only via Reader, not on your profile. This is pretty confusing, but I suppose inevitable when Google is introducing a new raft of technologies while phasing out the old ones.

Filed under  //   facebook   google plus   google+   social media   technology   twitter  
Posted July 7, 2011

Not convinced by Blekko. A micro review.

Quite a lot has been written on whether Google is finally about to lose its stranglehold on web search, including a recent article in the Guardian online. One of the most talked about new tools has been Blekko, which has received quite a bit of favourable coverage in various tech magazines and websites, for example on TechCrunch and Lifehacker.

Mainly because I don't like the idea of Google's monopoly, I've had my default search engine set as Blekko for the past 2 days. However, I have to say I'm not particularly impressed. Although I think the /slashtag concept is a good one, the engine itself is too limited. Two examples of genuine use follow.

1. Searching for a map of Withy Grove, Manchester, having noticed in the papers that this was one of the UK streets with the highest crime rates. This is the response I got from Blekko:

Blekko2
So, the first result isn't a map but part of the website of a company based in Withy Grove. The second result takes us to a map, but it's a bit awkward, from a site called MapQuest. However, where it gets really bad is when I add the slashtag /map... I get:

Blekko2a

So, useless. I would expect /map to restrict my search to Google Maps, Multimap, etc. with maybe a preview. This just doesn't work. In contrast, Google gets it first time, first result, with a preview, and does it faster.

2. Just moved and I have a credit balance with my electric company for my old address. I want to get it back. Entering "Eon credit balance refund" into Blekko gives me:

Blekko1
So, nothing like what I want. Just junk about credit cards and calling cards. Google on the other hand...
Google1
So, that's it: not scientific by any means, but a reflection of my experience with using Blekko. There is a frustrating lack of what I believe is termed "relevancy". I look forward to when something can challenge Google in web search, but Blekko is gimmicky rather than good. Slashtags don't make up for a lack of basic functionality.

 

Birmingham Syphilis

This is Google's cache of http://www.heartofengland.nhs.uk/templates/Page____6949.aspx. It is a snapshot of the page as it appeared on 14 Feb 2010 14:07:58 GMT.

Birmingham Heartlands Hospital, Solihull Hospital, Birmingham Chest Clinic

2nd Nov - Syphilis On The Up In Birmingham

Health experts are warning of a worrying increase in infectious syphilis across Birmingham. Cases of this sexually transmitted infection (STI) diagnosed in Birmingham Sexual Health Clinics have increased twenty fold (a 1887% rise) in recent years: from only 8 cases in 2000, to 159 in 2005.

Furthermore, in the first 10 months of 2006, there have been over 112 new cases reported. A number which is set to rise as further reports are received.

This trend mirrors data from the West Midlands as a whole and other major cities around the UK.

Dr Steve Taylor, consultant in HIV and sexual health at Heart of England NHS Foundation Trust, said: "Syphilis is no longer an infection of the past - we are now seeing several new infections a week in our clinics. It is definitely back on the list of STIs that sexually active people can contract during unprotected sex.

"We have seen cases in men and women of all ages and from different ethnic groups. It is really important that all sexually active people take responsibility for their own and their partners’ sexual health and use a condom with new and casual sexual partners.

"Syphilis can cause serious problems if left untreated, particularly in pregnant women when it can cause miscarriage, still birth or infect the unborn baby. But once identified it can be treated very easily by a course of antibiotics."

Dr Penny Goold, consultant in Genitourinary Medicine at the Whittall Street Clinic, a sexual health clinic in Birmingham city centre run by Heart of Birmingham Teaching PCT, said:

"The evidence we have suggests that we are only seeing a small proportion of the true number of people infected with syphilis. This concerns us because it means there are people out there who are unaware that they may have become infected and may pass on the infection to their sexual partners.

"We urge anyone who thinks they may have come into contact with an infected person to go to their local sexual health clinic for a confidential check up or to ask their GP. To find out where their nearest sexual health clinic is they can also contact NHS Direct on telephone number 0845 4647."

The symptoms of syphilis can be difficult to recognise and may take up to three months to show after having sexual contact with an infected person. The first sign is when one or more painless sores appear at the place where the bacteria entered the body. If the infection remains untreated, other symptoms can develop six to twelve weeks later, including a non-itchy rash across the body, especially on the palms of the hands and feet. Flu-like symptoms are common and sores in the mouth can develop, as well as patchy hair loss.

Syphilis is highly infectious at both of these stages so it is very important that anyone with these symptoms seeks medical help and avoids sexual contact until the infection has been treated. These symptoms may settle without treatment but several years later, syphilis can cause problems with the brain and many other parts of the body, so it is important to seek testing and treatment early.

 

Looping through filenames in STATA

For anyone who has become frustrated when using macros in STATA, this is a short guide. The problem in mind is writing code to loop through different filenames and paths in do files and programs, but I am sure it has other applications. Those with a programming background may find this obvious, but I certainly didn't...

Macros
Macros are strings that are stored in memory under a designated label. Macros can also hold numbers: STATA stores them in string format, but they can still be used as numbers.

Macros are of 2 types, locals and globals.

Local Macros are local in the sense that they are not retained beyond the do file or stata program in which they are declared. 
To set a local macro:

local x = "a string"
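The quotes are important here. Because of the = sign, STATA evaluates the right-hand side as an expression, so unquoted text would be read as a variable name rather than a string. Without the = sign (local x a string) the text is copied as typed and the quotes are optional, provided there are no awkward characters (such as ") present. Details on quotation use in stata can be found here.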
 
When we wish to use the macro we refer to it using STATA's special quote marks `x'.

Typing `x' is exactly the same as typing whatever you have set `x' to.
e.g. typing:

display `x' results in an error, as if display a string had been typed
display "`x'" displays a string, just as if display "a string" had been typed.

Numbers are treated in a slightly different manner: numbers stored as macros can be used as numbers or strings, e.g.
local y = 2
display `y'
2
display "`y'"
2

display "`y'"+"1"
21
display `y'+1
3

Global macros are retained in memory until expressly dropped or overwritten. Global macros are typically used for filepaths and similar static values. They are used slightly differently:
global x "abcdef" (quotes not necessary here)
display "$x"
abcdef

The dollar sign signifies the use of the macro. Again we need to use quotation marks to signal its use as a string when it is displayed/used. Otherwise they are used in a pretty similar manner.

Concatenation (combining of strings).
This is often done in loops, for example when outputting simulation results or reading in consecutively numbered data files: file1, file2 etc.

Local macros are easily combined when we remember that calling the macro is exactly the same as typing the string stored.

local x="a string"
local y="is born"
local z = "`x' `y'"

di "`z'" 

a string is born

Other text strings can be easily incorporated into the new macro:
local q = "`x' vest `y'"
di "`q'"
a string vest is born

Global macros can be combined in much the same way:
global filepath "C:\Documents and Settings\username\examples\"
global filename "testdata"
global suffix ".dta"
display "$filepath$filename$suffix"
C:\Documents and Settings\username\examples\testdata.dta

Something that you can't do quite as easily with global macros is the incorporation of a text string. e.g.
global filename "testfile"
global suffix ".dta"
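display "$filename1$suffix"
.dta
Here STATA looks for a macro called $filename1, which doesn't exist, so only the suffix is printed. We need to separate the three strings, either by wrapping the macro name in braces (${filename}1$suffix) or by using quote marks: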
display "$filename""1""$suffix"
testfile1.dta

We can combine local with global macros. This is useful when we wish to loop through filenames:
forvalues x = 0(1)10 {
use "$filepath$filename`x'$suffix",clear
}

Hope this was helpful.

Categorical data reformatting

I have this categorical dataset, here's a sample:
p,x,s,n,t,p,f,c,n,k,e,e,s,s,w,w,p,w,o,p,k,s,u
e,x,s,y,t,a,f,c,b,k,e,c,s,s,w,w,p,w,o,p,n,n,g
e,b,s,w,t,l,f,c,b,n,e,c,s,s,w,w,p,w,o,p,n,n,m
p,x,y,w,t,p,f,c,n,n,e,e,s,s,w,w,p,w,o,p,k,s,u

I want to convert it into numerical form, then have each different category indexed from 0. So instead of p,e in a column I want 0,1.

Is there a quick way to do this? Without writing a little script?

I guess I could replace each letter with a corresponding number, then reduce each column down so its codes start from 0.
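One possible approach, sketched in Python purely as an illustration (so it is still a little script, just a short one). It assumes the data is comma-separated with one record per line, and it numbers the categories in each column from 0 in order of first appearance, so in the first column p becomes 0 and e becomes 1. The function name encode_columns and the inline sample are made up for the example.

```python
import csv
import io


def encode_columns(rows):
    """Recode each column's categories as 0, 1, 2, ... in order of first appearance."""
    n_cols = len(rows[0])
    # One lookup table per column: category letter -> integer code.
    mappings = [{} for _ in range(n_cols)]
    encoded = []
    for row in rows:
        codes = []
        for col, value in enumerate(row):
            mapping = mappings[col]
            if value not in mapping:
                # A category seen for the first time gets the next free code.
                mapping[value] = len(mapping)
            codes.append(mapping[value])
        encoded.append(codes)
    return encoded, mappings


# The four sample records from the post, one per line.
sample = """p,x,s,n,t,p,f,c,n,k,e,e,s,s,w,w,p,w,o,p,k,s,u
e,x,s,y,t,a,f,c,b,k,e,c,s,s,w,w,p,w,o,p,n,n,g
e,b,s,w,t,l,f,c,b,n,e,c,s,s,w,w,p,w,o,p,n,n,m
p,x,y,w,t,p,f,c,n,n,e,e,s,s,w,w,p,w,o,p,k,s,u
"""

rows = list(csv.reader(io.StringIO(sample)))
encoded, mappings = encode_columns(rows)

for codes in encoded:
    print(",".join(str(code) for code in codes))
```

The mappings list keeps the letter-to-number table for each column, which is handy if the codes need translating back later.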

Google search prediction

Google

Don't you just love the google search predictor function...
Any other good ones?

GMPTE Consultation

I used to cycle along Oxford Road to work, but it is so perilous I have been getting the bus since early October.

Oxford Rd/Wilmslow Rd.
Cycling along the Oxford Road/Wilmslow Road corridor is currently extremely dangerous, and I am pleased improvements are to be made. The most notable issues are:
  • Through Rusholme: moving through busy traffic, buses pulling into stops, cars pulling around vehicles waiting to turn right. This is a problem along the whole corridor, but particularly in Rusholme. Even with the marked cycle lanes we have currently, cyclists are not protected.
  • The raised cycle path in Fallowfield, although a good idea, is frequently blocked by people parking to use the takeaways or delivering.
  • The section of cycle path alongside Whitworth Park is used more by pedestrians, which makes cycling (and walking there) dangerous.
  • Many motorists and bus drivers pay no attention whatsoever to the advance stop areas for bikes.
  • The section north of the Manchester University campus offers no protection at all for cyclists. Some connection from Oxford Road to the cycle path that runs behind Upper Brook St. to Sackville St. would be fantastic.
  • A general issue on Oxford Road is dozens of people crossing the road at Wellington square, just south of precinct centre. Some kind of crossing might be helpful?