Sunday, March 11, 2007

Jean-Claude Bradley PAETC07

link to screencast

Man: All right. I'd like to talk to you about research and teaching with blogs and wikis. Because a lot of you are involved with the teaching aspect, and we did actually discuss a lot of the teaching in the workshop yesterday, I'll be focusing more on the research. But as I go through this, look for applications in some of the other kinds of teaching environments where you can see that this could apply.

The big picture here is we're doing science - we're doing chemistry - and if you're not in the sciences, or you're not running a research lab, maybe you're not aware of what's going on right now. There is a transition in communication, which used to be human-to-human, towards a world where it's going to be machines talking to each other.

Where we are right now is somewhere in-between: people talking to each other, people talking to machines, and what that means, basically, is machines reading scientific information and doing something with it. We don't have to discuss in detail here what that is, but that's sort of the concept.

There are things like the robot scientists, that actually can form hypotheses and do reactions and test them, so this is the world that we're going to. But the question is, how is it going to happen? For this particular talk, we're going to move over here to using the blogs and wikis in answering that question.

We're trying to do science; we're trying to do science that is useful, that is relevant, and we're going to find that out by actually looking at what people are saying is important. That's how this particular project, UsefulChem, started.

We looked for phrases like "what is needed now," "what is missing is," and found that basically malaria was one of the things that was most often cited. So we did a project on malaria. We're trying to do the entire project completely transparently, using blogs and wikis. What I'll discuss is the evolution of that, and how we used each technology to meet our objective.

One of the things that we came across are other groups that are actually producing information that other people can use. Find-a-Drug is an organization that had actually tested some molecules to see whether or not they have a good chance of inhibiting one of the enzymes for the malaria parasite.

They gave us their collection of molecules, which we set out to make, because we're a synthetic organic chemistry lab. So this is one of the components that's coming together to make this project happen.

As far as the chemistry is concerned, this is not really the right place to discuss it, but the planning of the synthesis and discussing the pros and cons was all done through a blog, and that was done openly. If you're doing chemistry, you're basically having to deal with molecules.

So one of the ways that we started to use blogs is by actually having one post per molecule, where we would put information about the molecule - for example, a link to its commercial availability, a link to some of the spectroscopic properties of the molecule. And you'll see this little "SMILES" here; this is a way of representing molecules that is often used today to search in databases.

Remember I was telling you about the automation part, machines interacting with it? I think that using social software is a good way to bridge the gap that exists right now between machines and humans.

A second blog that we started - and again, we did all of this not as a big planned strategy, but basically we just created what we needed to, to reach the objectives that we had, or to overcome some of the problems. Initially, I thought that it would be good if we had an experiment blog, where every experiment was a separate post and we could link specific molecules that we used back to the molecules blog.

In fact, that was kind of interesting because we got comments from people; a few comments from people around the world. This particular comment is about this reaction that actually is not even finished yet, and he's talking about how maybe the concentration isn't good, maybe we should use a different concentration.

Normally, if you're not familiar with science publishing, this really never happens. Normally you do all your work, you completely get it done, then you submit it to a publisher, and then it goes through the peer-review process, and much, much later down the road do people even know that you've done it. So by posting our experiments directly to the web the same day that the experiment is done, we can get feedback. That's what we're trying to achieve here.

Now, that worked for a little while, but then it became obvious, as we discussed yesterday in the workshop quite extensively, that because the blog format is strictly chronological, it's very good for discussing time-sensitive issues but not very good for organizing.

So I created this UsefulChem wiki to basically link to specific blog posts. We still use the blog for new items, but we use the wiki to organize that information. That actually worked out pretty well.

One of the very next things that we could do on the wiki, in terms of organization, is actually talk about our failures, which is not something that is often discussed. It's pretty hard to publish things that failed in journals. It is possible, but normally you do it in the context of a larger project, and it's not something you want to dwell upon.

We do want to dwell upon it, because the wiki is actually our laboratory notebook. Everything that is done in my lab is actually recorded, so we have to address the issues. If this didn't work, what's the reason it didn't work, or maybe there's a mistake that happened.

This is a page on the UsefulChem wiki where we tried to make this compound, and it turns out that there were errors in the peer-reviewed literature that led us astray. There were issues like that which we're actually able to document and say, "We were doing this based on this report; turns out that that wasn't actually accurate, so this is what we're finding."

And then finally, experiment 25 is when we had the whole thing working completely. This is something that any chemist can use to make the compound that we're trying to make here.

Woman in audience: What level are the students that you're working with?

Man: The students in the lab ranged from graduate to undergraduate students. This is university level.

Now, the experiments I was telling you about, where every post is a separate experiment, the problem with that is that there's no history tracking in the typical blog. So a student would write something, but then they would get a comment, and then when they would fix it there was no trace that it was ever made differently. There was no trace of how it was actually fixed.

So we found that actually having a wiki page as an actual experiment is an extremely effective way of tracking who does what, when. You can also track what we knew at a certain time. So if we're talking about precedence, I think wikis are actually far superior to blogs.

For example, as I was just saying, in terms of who does what, there's a "History" button, and it tells you who has done what. This is my grad student, and I can click on this to see exactly what he contributed at this specific point in time, and I can keep going back, over and over. Even though you're not doing research necessarily in chemistry, you are doing research in other areas, and this is really a good way to track and to communicate with your students.

This is an example of what it actually looks like. If you're not familiar with the wiki interface, I use Wikispaces for a number of reasons that we discussed yesterday. One of them is that it has a really nice way of representing what got changed. The red is the text that got deleted, and the green is the text that got added.

In this particular example, I had put "y millimoles, z percent yield," because the student had not put the percent yield. Instead of calculating it and putting it in myself, I will typically ask a question or say, "Something's missing here."

So that's a good way for me to interact with my students, more as a mentor than a fact-checker. It takes a little bit longer for the information to get accurate, but I think it's worth it, because that's really my job as a teacher: to teach them how to learn and how to tell if things are right or not.
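
The red-and-green change view described above can be approximated with a word-level diff. This is only a minimal sketch using Python's standard `difflib`; it is not how Wikispaces itself computes changes, and the notebook sentences and the `word_diff` helper are hypothetical.

```python
import difflib


def word_diff(old: str, new: str) -> list[tuple[str, str]]:
    """Return (tag, word) pairs, where tag is 'same', 'deleted' or 'added'."""
    old_words, new_words = old.split(), new.split()
    matcher = difflib.SequenceMatcher(a=old_words, b=new_words, autojunk=False)
    result = []
    for op, i1, i2, j1, j2 in matcher.get_opcodes():
        if op == "equal":
            result.extend(("same", w) for w in old_words[i1:i2])
        else:
            # 'replace', 'delete' and 'insert' all reduce to
            # "these old words went away, these new words appeared".
            result.extend(("deleted", w) for w in old_words[i1:i2])
            result.extend(("added", w) for w in new_words[j1:j2])
    return result


# Hypothetical notebook text before and after a mentor's edit.
before = "The product was isolated as a white solid."
after = "The product was isolated as a white solid (z percent yield)."

changes = word_diff(before, after)
for tag, word in changes:
    if tag != "same":
        print(f"{tag:8} {word}")
```

A wiki would render the "deleted" words in red and the "added" words in green; here they are just printed with their tag.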

Again, a very important thing: if we're talking about science, it's a really big deal for you to say that you are the first to do something, or that you knew something at this particular point in time. And there again, the blog is not a particularly good way to do that, because on Blogger, for example, you can change the date to any date if you want to. It's pretty difficult to find out if this post actually occurred at this time.

But if you use a third-party wiki like Wikispaces, you can see exactly the date and time of that entry. So you can link not only to the wiki page, but you can link to the specific version of that page. And you can do that pretty much on any wiki, I think, and that's an excellent way of resolving debates, which are bound to happen.
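
The idea of linking to a specific, timestamped version of a page can be sketched as an append-only revision log. This is an illustration only, not the Wikispaces data model: the `WikiPage` and `Revision` names, the page title, and the dates are all made up, and in a real third-party wiki the server, not the author, would supply the timestamp.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class Revision:
    number: int
    author: str
    saved_at: datetime
    text: str


@dataclass
class WikiPage:
    title: str
    revisions: list[Revision] = field(default_factory=list)

    def save(self, author: str, text: str, saved_at: datetime) -> Revision:
        """Append a new revision; earlier revisions are never modified."""
        rev = Revision(len(self.revisions) + 1, author, saved_at, text)
        self.revisions.append(rev)
        return rev

    def as_of(self, moment: datetime) -> Optional[Revision]:
        """Latest revision saved at or before `moment` (assumes saves are in time order)."""
        known = [r for r in self.revisions if r.saved_at <= moment]
        return known[-1] if known else None


# Hypothetical page and timestamps, for illustration only.
page = WikiPage("example-experiment")
page.save("student", "Reaction started.", datetime(2007, 1, 1, 9, 0, tzinfo=timezone.utc))
page.save("mentor", "Reaction started. Yield missing?", datetime(2007, 1, 1, 15, 0, tzinfo=timezone.utc))

noon = datetime(2007, 1, 1, 12, 0, tzinfo=timezone.utc)
print(page.as_of(noon))
```

Because old revisions are kept rather than overwritten, the question "what did this page say at noon?" always has a checkable answer, which is the property that matters when resolving a priority dispute.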

So, I was showing you how, on a specific page, you can tell the difference. In the wiki you can also click on a button that shows you all the recent changes. So in this case, on a day-to-day basis, that is how my students and I interact. We check the recent changes button. I'll see that experiment 25 got updated. I'll see what they did, and I'll respond. That's a pretty good way to interact when you don't have a lot of time to meet face-to-face every day.

The other thing that I think is very, very important is finding out about your audience and finding out how your work is being received. Normally when you publish something in a typical paper peer-reviewed journal, what happens is that you'll know who has cited you, but you won't know who has actually read you and why they've read you.

And when they cite you, you don't know if it's good or bad. They may actually be citing you because you did something stupid, and that's the reason you get a high citation count.

The advantage to using social software is that you can put things like Sitemeter, which tracks how many visitors you have a day but also the keywords they are using or the websites they are using to make it to your site. So on this particular day, I can see that I got one referral from another chemistry blog site, and I can see that someone searched on Google for the chemistry of protease inhibitors, which is actually an HIV issue that we discussed at one point.
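
The kind of information a visitor tracker reports, the referring site and any search keywords, can be read out of the referrer URL. The sketch below is not Sitemeter's implementation; it assumes the search engine puts the query in a `q` parameter, and the `describe_referrer` helper and both URLs are hypothetical.

```python
from urllib.parse import parse_qs, urlparse


def describe_referrer(referrer_url: str) -> dict:
    """Split a referrer URL into the referring site and any search terms."""
    parsed = urlparse(referrer_url)
    query = parse_qs(parsed.query)
    # Assumption: the search terms live in a 'q' query parameter.
    terms = query.get("q", [""])[0]
    return {"site": parsed.netloc, "search_terms": terms or None}


# Hypothetical referrers: one search visit and one link from another blog.
search_visit = describe_referrer(
    "http://www.google.com/search?q=chemistry+of+protease+inhibitors"
)
blog_visit = describe_referrer("http://chem-blog.example/2007/03/some-post.html")

print(search_visit)
print(blog_visit)
```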

So this is something that I look at every day to see what impact we are having and how people are finding our materials. Sometimes it's surprising to see - you'll see people looking for a boiling point. You would never just publish a boiling point. But because we are recording everything we do, we record it. And that's useful.

So even though it is not a complete thought, it is something that can be used by other chemists. And they do use Google. You do see people looking for boiling points on Google. So it's definitely a vehicle that I think is going to be used more and more.

[unintelligible question]

The question is about Sitemeter. It is actually a third party. And the free version lets you view the last 100 entries. You can pay like $6.95 a month if you want to get up to 4000. But all the tools I have shown so far are 100% free.

I was talking a little about the automation part. So, again, we don't have chemists in this room, but basically InChI is a way of representing a molecule that is considered to be very good because it is unique to that molecule. And so we have agents, for example, that will read our blog, calculate the InChIs for the compounds, and publish them on another page. That happens every day.

So if you go on Google and you search for the InChI, it will pull up our wiki pages and our blog pages, wherever that particular string happens to have occurred. That's what's exciting in terms of chemistry. You can do something as a human being.

And then without even your knowledge, you can have these autonomous agents start processing, figuring out if this has been published before, if anybody has done this before. So this is pretty exciting. You can do all kinds of things by producing 3D structures that you can rotate.
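
An agent of the kind described here can be sketched as a script that scans posts for molecule strings and builds an index of where each compound appears. This is a hypothetical illustration, not the agents used in the project: it assumes each post marks molecules on a line starting with `SMILES:`, and the `to_identifier` argument stands in for a cheminformatics toolkit call that would compute a unique identifier. The demo simply passes the SMILES string through unchanged.

```python
import re
from collections import defaultdict
from typing import Callable

# Assumed convention: one molecule per line, written as "SMILES: <string>".
SMILES_LINE = re.compile(r"^SMILES:[ \t]*(\S+)[ \t]*$", re.MULTILINE)


def build_identifier_index(
    posts: dict[str, str],
    to_identifier: Callable[[str], str],
) -> dict[str, list[str]]:
    """Map each compound identifier to the pages on which it occurs."""
    index: dict[str, list[str]] = defaultdict(list)
    for url, body in posts.items():
        for smiles in SMILES_LINE.findall(body):
            index[to_identifier(smiles)].append(url)
    return dict(index)


# Hypothetical pages; CCO is ethanol and c1ccccc1 is benzene.
posts = {
    "blog/post-1": "Ethanol\nSMILES: CCO\n",
    "wiki/exp-1": "Used ethanol as solvent.\nSMILES: CCO\nSMILES: c1ccccc1\n",
}

# Placeholder: a real agent would call a toolkit here to compute the identifier.
index = build_identifier_index(posts, to_identifier=lambda smiles: smiles)

for identifier, pages in index.items():
    print(identifier, "->", ", ".join(pages))
```

Publishing the resulting index as its own page is what lets a search engine pick up every place the same compound was mentioned.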

If you are familiar with RSS, you know that when you subscribe to a blog, you are doing it with a blog reader. There is something called CMLRSS, where CML is Chemical Markup Language. The reader of course has to be able to understand chemistry, so Bloglines wouldn't do it, for example.

It can actually read the text as a molecule. Then it can do things with it. That is something that is relatively new and it's going to be used more and more, again making that human-machine bridge. If you do use Bloglines, it will still work, but it will just ignore the CML.
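
The difference between an ordinary reader and a chemistry-aware one can be sketched with a feed that carries molecule markup in its own XML namespace. The feed below is a simplified, made-up example rather than a real CMLRSS document, and the namespace URI is an assumption; the point is only that one reader skips the elements it does not know while the other extracts them.

```python
import xml.etree.ElementTree as ET

CML_NS = "http://www.xml-cml.org/schema"  # assumed CML namespace URI

# Hypothetical, simplified feed with one item carrying molecule markup.
FEED = """<rss version="2.0" xmlns:cml="http://www.xml-cml.org/schema">
  <channel>
    <title>Example molecule feed</title>
    <item>
      <title>Ethanol</title>
      <cml:molecule id="m1">
        <cml:name>ethanol</cml:name>
      </cml:molecule>
    </item>
  </channel>
</rss>"""


def plain_titles(feed_xml: str) -> list:
    """What a general-purpose reader sees: item titles only, CML ignored."""
    root = ET.fromstring(feed_xml)
    return [item.findtext("title") for item in root.iter("item")]


def molecule_names(feed_xml: str) -> list:
    """What a chemistry-aware reader can additionally pull out of the feed."""
    root = ET.fromstring(feed_xml)
    return [
        mol.findtext(f"{{{CML_NS}}}name")
        for mol in root.iter(f"{{{CML_NS}}}molecule")
    ]


print(plain_titles(FEED))
print(molecule_names(FEED))
```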

One of the things about going through this project which has been really interesting is that it is a completely bottom-up approach. There are other people around the world who are interested in doing science in an open and transparent way. We kind of meet each other naturally because their blogs and wikis are out there.

So the organization is trying to coordinate scientists interacting with each other. One of their biggest projects is on malaria. It is actually a commercial enterprise, where they actually looked up information for us in exchange for having a link. So it's not just people who are doing it without profit. There is every motive out there; there are ways for people to interact to get things done.

The other thing that is very useful about actually having the work completely publicly available is that you can have collaborations. For example, with Beth at Lehigh Carbon Community College. She teaches English, and we have been able to do a little collaboration where her students can look at our malaria blog and be able to look at the humanities components of malaria, for example, or of the issues.

So that is something that you can only really do if you do operate transparently because it is already there. You don't have to do anything special. You don't have to add accounts. You don't have to control it. If you see some patterns here that are useful you can certainly contact me and we can show you how to do this.

I was telling you that the blog wasn't that great for recording experiments, but it is really good for recording your milestones. The main UsefulChem blog - basically, if something interesting happens that a good portion of the population can understand, I will put it.

That doesn't mean that 100% of the people will understand all of the posts. For some of them you do have to be a chemist. But for some of it you can probably figure out roughly what's going on. We are trying to make compounds; if we are having a lot of trouble, I will explain why we are having trouble. If we've made it, I'll be happy and give you a post saying that we made it.

So you might not even know what we've done, but at least you can see roughly where the project is going. So that is where the blog fits in really nicely. And then of course, the wiki has really evolved to be a very good laboratory notebook.

You can do raw data on here and there are other people that have other ways of communicating their raw data. That's a whole other issue. So you have other blogs like Daily which actually do publish some of their failed experiments. Not all, so it's not an actual notebook. But there is some of that going on.

You have people discussing vendor reliability. That's not something that you usually read about in the journal work. But it is extremely important as a chemist to know not to order from a particular vendor.

There is OpenWetWare - I don't know if you have heard about this. This is a pretty big initiative from MIT to basically do science using wikis. But here people are pretty selective about what they will actually share. There is an example here for intragroup communication.

So this is public; there is a diagram here and a little explanation. But you really have to be in the group to understand what this means, because there are a lot of things that have not been explained here. We try to make our wiki pages very complete.

If you are a chemist and you happen to land on that page, you can absolutely understand 100% of everything that got done. You can look up the raw data to see if it really does support the claims that we are making. You can link out to understand our motivations. So there are these components that are flying around, but I think it really becomes powerful when you integrate everything.

Another blog here discusses research hypotheses. She doesn't really put raw data up. But she likes to talk about "I think this is going to happen," and she gets good feedback from her colleagues that way.

And I would say probably the greatest collaborations that we have been having lately are really from the automation part. There are people from around the world that are coding for chemistry, and we are contributing to that and we are using all of this open source stuff that is going on and these are some of our collaborators. I think that over time, this can spill over to the actual chemical research, but right now it is mainly the coders who are collaborating.

And this is called, in general, "open source science." There was an article over the summer in Chemical & Engineering News, which is the flagship journal for chemists. These are my two students here. This is actually really important, because the fact that open source science appears in C&E News means that it is moving towards the mainstream.

It means that people who are not just freaks on the border are thinking that there is something to this, that there is a reason for doing it. And it can be something that is very useful. I think that that's very important.

