Sunday, April 13, 2008

ACS Talk on Cheminformatics in Open Notebook Science

Jean-Claude Bradley: ...so yes, I'd like to talk to you about Open Notebook Science, specifically, the role of cheminformatics in terms of storing information and in terms of retrieving it. So first of all, a little definition, what do I mean by Open Notebook Science? Well, if you look in the past couple of years, there's been a movement towards making the scientific process more open. I'd like to use this little chart here to show where we've come from and where we're going to.

In the traditional lab notebook, for example, everything is unpublished unless somebody makes an effort to put the data together and then send it for publication. But if they don't do that, everything, including the failed results, is not going to make it to anybody. All right, so as we move along this area here, we have the traditional journal article, so in that case, it's less closed in that more people can have access to the information. But again, it doesn't include a lot of the elements such as the failed experiments or the view of all of the experiments that have been run in a given project.

Something that has come up a lot lately is open access journal articles, and here we're talking about articles that can be accessed for free by anyone and, typically, the author or a third party will pay the cost of that. That's good, it's making things even more open, but again, these still have the traditional journal article format. So what we're talking about here in Open Notebook Science is full transparency in the scientific process where, actually, the lab notebook of my research group is made available in real time using a few different technologies which I will discuss.

OK, so to give you some background on where we are: my group at Drexel is a synthetic organic group and, doing these things openly, we need collaborators, obviously. One of the greatest benefits of doing Open Notebook Science has been to find some great collaborators. Rajarshi here has been a very active collaborator of ours; he's been doing docking for us. We've had people who've tested our compounds. Lately, Phil Rosenthal has been testing our compounds for anti-malarial activity. So most of what I'll be talking about today is in the context of anti-malarial agents. We've also tested a few compounds for anti-tumor activity.

What I'm going to do in the later part of my talk is give you screenshots of the various tools that we use and how they fit together. So we have a blog, we have a wiki, we use Google Docs, mailing lists, we use ChemSpider, and we use CDD and all of these talk to each other. You'll notice that all of these are free-hosted services and that's really important for me. If we have people who think that what we're doing is a good idea, they should be able to replicate it for no cost and with minimal effort and because the services are free-hosted, you can in fact do that.

I'd like to start the story with the blog. In the blog, we typically report things such as milestones or larger problems that we've been having. There are not any hard-core experiments in the blog because that would be very monotonous and you wouldn't read it. So we put things that are more interesting to a broader audience.

Here, we're targeting this enzyme, falcipain-2, a malarial enzyme which Phil Rosenthal is testing for us, and I'm talking about all kinds of things. Here, I'm linking to EXP150. Now, this is actually a link to the wiki that is the lab notebook of how those compounds are actually made. So what I'm talking about is this compound here from this Ugi reaction, and here's a nice picture of the crystal of it.

So when I click on that link, it takes me to a pretty long page. I'm going to look at different sections of it and show you where the information comes from, and I'll try to focus more on the cheminformatics aspects of this. Essentially, if you land on this page, you should be able to link to the summary post that sort of explains the bigger picture of why we're doing this. Someone might have landed on this page just by doing a Google search. It has a whole explanation of everything including the molecules.

So if you want to click on this link, this Ugi [indecipherable], it will actually take you to the ChemSpider entry for that compound. Tony has already been through how great ChemSpider is, and it is, and it's got all kinds of information that his [indecipherable] calculates that I don't have to, and that's a huge benefit for a group like mine that doesn't really have a lot of computer science people. I mean, we're synthetic organic chemists.

So we can also link to the experimental plan of the experiment and we can link to the docking procedure. Rajarshi was talking about storing the procedure in such a way that other people would be able to reproduce it, and that's what we try to do as much as possible. Again, this is just a wiki page, so we're linking to the library; we're linking to information on the enzyme that we're docking against. Rajarshi actually wrote this and these are the results.

If we click on these results links, we end up with a Google Doc that has just the list of SMILES in the order in which they were docked with the enzyme. So this is still falcipain-2 that we're looking at and Rajarshi is saying these are the top ten compounds that we should probably think about making. That's what we're going to do and we've been getting feedback from the people doing calculations to decide which compounds to make next from our virtual libraries.

We also have a Procedure section, OK, so that's somewhere down around here. The idea here is to write up the information in such a way that it could be quickly copied and pasted when we submit some of this work for publication in a traditional article. What I'm talking about here is not a way of bypassing the traditional system; it's just a way of getting the information out there much more quickly. We're still following that plan of submitting traditional articles and this is, of course, what the format would look like.

Again, part of that page, there's a Results section and here, instead of just linking to the NMRs as PDFs, we actually link to them in JCAMP format and we use JSpecView, written by Robert Lancashire. It's a fantastic free program that allows you to view an NMR spectrum, or any spectrum in JCAMP format, in a browser in a way that can be expanded very easily. So this here is what the spectrum might look like in a PDF in a Supplementary Information section.

I know a lot of you probably know this: you want to find out what the NMR looks like and you get this PDF that was scanned and you can't tell what the picture looks like. But if you actually have the raw data in JCAMP format, you can easily expand this little peak and see that yes, it is a triplet and you can measure the [indecipherable], you can do whatever you want with it. That's really, really important: if you're making statements about things, people have to be able to verify your raw data, they have to be able to reach the same conclusions based on the same information.

JCAMP is really a nice little format to use; we've done a couple of things with it. I had a [indecipherable] student last year write Excel VBA, where we basically monitor reactions. We would basically give the start time of the experiment and then Excel VBA will be able to run and it will calculate all of the different concentrations over time.

Of course, you have to assign the peaks; you have to tell it which peaks correspond to which compound. But once you've done that, it automatically prints out a kinetics run for the disappearance of some compounds and for the appearance of other compounds. So those are some of the neat things that you actually cannot do with a PDF of an NMR. So it's very, very useful to keep it in that format.
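As a rough illustration of the kind of calculation described here (this is not the Excel VBA the student wrote), the following Python sketch converts assigned peak integrals into concentrations over time. It assumes the integrals have already been extracted from each JCAMP spectrum and matched to compounds, and that each tracked peak represents the same number of protons. The function name, compound labels, and numbers are all hypothetical.

```python
from datetime import datetime


def concentrations_over_time(start_time, spectra, initial_conc):
    """Convert assigned peak integrals into concentrations versus elapsed time.

    start_time   -- datetime when the reaction was started
    spectra      -- list of (timestamp, {compound: integral}) pairs, one per spectrum
    initial_conc -- total concentration (mol/L) shared among the tracked compounds
    """
    rows = []
    for timestamp, integrals in spectra:
        elapsed_min = (timestamp - start_time).total_seconds() / 60.0
        total = sum(integrals.values())
        # Each compound's share of the total integral scales the known total concentration.
        conc = {name: initial_conc * area / total for name, area in integrals.items()}
        rows.append((elapsed_min, conc))
    return rows


# Hypothetical monitoring run: one compound disappearing, another appearing.
start = datetime(2008, 1, 1, 9, 0)
run = [
    (datetime(2008, 1, 1, 9, 30), {"starting material": 3.0, "product": 1.0}),
    (datetime(2008, 1, 1, 10, 0), {"starting material": 1.0, "product": 3.0}),
]
for minutes, conc in concentrations_over_time(start, run, initial_conc=0.5):
    print(minutes, conc)
```

The peak assignment stays a manual step, as in the talk; only the arithmetic after it is automated.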

And as you saw Tony talk about, we've also been using ChemSpider to characterize compounds. So the difference here is that these are spectra that I approved of; these are final isolated compounds. But of course in research, you take monitoring runs, you take impure compounds, you're in the process of purification. And so all of those NMRs and spectra have to be accounted for as well. So right now, we're only using ChemSpider to store the best stuff, the stuff that we would send to a paper.

And the next thing that we're about to do - we've almost got this done - I'm working with Andy Lang, and we're using Second Life to actually display NMR data. And this will read JCAMP, and it will do so in a way where you can actually expand the spectrum. So right now, this is actually working, but for a spectrum of a fixed X-axis.

We're working so that you can talk to the spectrum and tell it to expand. And we've almost got that done, and I think it's just going to expand the capabilities of Second Life. I'll be talking about Second Life tomorrow. Actually ACS Island is going live tomorrow at the SciMix, so hopefully I'll see some of you there.

Another section of this page - again, I'm going through this long, long page that's one experiment - is that there has to be a log. So we can actually construct the rest of the experiment based on the log, but without the log, we can't do anything because you can't remember what you did, exactly when you added things, exactly the way you measured them.

This is something that I try to reemphasize to my students: It's absolutely critical that you keep a proper log because later on, you can do whatever you need to do. So when I say that our experiments are in real-time, what it means is that the log has to be up by the end of the day. The other sections of the experiment can take weeks to actually get uploaded.

So finally, when you come to the conclusion section of this article, it says that the Ugi product was obtained in 59% yield. You don't have to take our word for it at all; you can go back and reinvestigate every single aspect and the arguments that we made.

OK, so now comes storing and retrieving information and this is where the Cheminformatics comes in. So in order for us to retrieve compounds in experiments, it's been a challenge. We've used a tag section, so at the very bottom of each blog or Wiki page, there will be tags. And we can use a number of things for this: We can use SMILES, for example. But we chose not to do that because there are multiple SMILES for a given compound.

So we've been using InChIs and that's worked well for small molecules, but the reality is that for very large molecules, like our Ugi products, the InChIs are not indexed properly by Google. So what we've started to do in the past couple of months is use these InChIKeys. And we use ChemSpider to provide the service to generate those InChIKeys based on either the InChI or the SMILES that are submitted to it. And these links here of the common names, those are for human readability, and when you click on them it takes you to ChemSpider.

So that's how everything is connected together, so that when you do a Google search with this partial InChIKey, it will come up with all of the different experiments where we used that compound. And if you're going to do this, you'll have to remember to click this 'Repeat Search with Omitted Results' because Google will assume that you're not interested in results if it comes from the same domain name - but we are because that's the point of this, we want to get all the experiments.
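To make the partial-InChIKey search concrete, here is a minimal Python sketch that takes the first 14-letter block of an InChIKey and builds a Google search URL from it. The helper names and the `example.org` site are hypothetical, and the `filter=0` parameter is included on the assumption that it corresponds to repeating the search with omitted results included; the key shown is the one for ethanol, used only as an example.

```python
import re
from urllib.parse import urlencode

_FIRST_BLOCK = re.compile(r"^[A-Z]{14}$")


def partial_inchikey(inchikey):
    """Return the first 14-letter block of an InChIKey (the part before the first hyphen)."""
    first = inchikey.strip().split("-")[0]
    if not _FIRST_BLOCK.match(first):
        raise ValueError(f"not a recognisable InChIKey: {inchikey!r}")
    return first


def search_url(inchikey, site=None):
    """Build a Google search URL for the partial key, optionally limited to one site."""
    query = partial_inchikey(inchikey)
    if site:
        query += f" site:{site}"
    # Assumption: filter=0 asks Google not to collapse similar results from one domain.
    return "https://www.google.com/search?" + urlencode({"q": query, "filter": "0"})


print(search_url("LFQSCWFLJHTTHZ-UHFFFAOYSA-N", site="example.org"))
```

Searching on the first block alone keeps the query short enough to be indexed as a single word, which is the property the tags rely on.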

Now you can do a lot of tricks with Google. For example, if you're familiar with the Google Co-op, also called Google Custom Search, you can take all of our blogs, Wikis and all the pages that we've generated and create a special Google search that will only look on our approved pages; and if you do that then you have a way of searching a very rarified part of the Internet. So all that is available, totally free and anybody can do this.

So how are people actually finding our experiments? Well, I've yet to actually catch anyone using an InChIKey to find an experiment, so this is something that, moving forward, we're going to be doing. But I think it's very, very interesting to see how people are actually finding our experiments through Google.

It falls into four different categories. It could be specific compounds. So someone might be looking for the NMR of TFA. If they do that they're in luck, we have lots and lots of NMRs of TFA. It could be a molecular formula. They could just be searching for T-guanidine, just tell me everything that you know about it, and I'm sure they're going to find ChemSpider hits with that. But people are clicking on these links and they're finding them.

It could be experimental conditions. And this is actually really important because a lot of this stuff is like side reactions of amines. If you search in the traditional literature for side reactions, for how stuff can fail to work, you typically don't find that there. But of course, the typical lab book is almost all failures, so you're going to find lots and lots of stuff that doesn't work, and that's the point. If someone was searching for kinetics of the Boc protection, they will also be in luck; we have lots of kinetics analysis of that.

So the other thing that they can find, at a higher level, is educational material, which I talk a lot about. So if they're looking for free downloadable chemistry videos, we've got that. If they're looking for 3-D periodic tables, it's something that I discuss, and people are looking for that. And also, some people are looking for bigger pictures, like malaria targets, cheminformatics, project proposals. So I've discussed the proposals that I've put in and that's great, somebody who was looking for that would have actually found it. And of course, you can also search these experiments just by the traditional table of contents file.

OK, so now if you want to use this information in a much more meaningful way, we want to be able to compare different experiments. And as you keep doing experiments it becomes more and more difficult to keep in mind everything that's been done. So again, we're using tools as they become available to us. If somebody comes to me after my talk and volunteers to run a database with this, I'd be very happy. But right now, we're using Google Docs because it's simple and we don't have a lot of results to track, and so it makes sense, but of course, at some point, it should be imported into a real database.

But basically here, each one of these rows is one of these experiments of these Ugi reactions. And one of the things that we observed is that sometimes we get a precipitate and sometimes we don't. If you get a precipitate that's great because you can just filter it and you can scale it up, and you can do a lot of things. If you don't get a precipitate, you would have to run chromatography and that would really complicate things and make it very difficult to scale up.

So one of the things that we've been looking at is can we predict which Ugi product is actually going to precipitate. So in order to do this simply, everything is put into this table that is publicly available, and then people are free to run models on this. I'll talk about that a little bit later, well now actually.

Rajarshi has built models to actually predict, going forward, for compounds that we have not tried to make yet. He's predicting right now - in the list of 100 compounds that we're scheduled to make - that these three should precipitate. We've also collaborated with other people, Mesa Analytics. They've actually, just this morning, run a model and co-predicted that this compound should precipitate.

But they predicted another compound that Rajarshi did not predict to precipitate. So again, as he was talking about, it'd be really nice to have ways of comparing models to each other and to do that in a very systematic way. As long as we're keeping this fully open, it makes it very easy for people to participate and collaborate with us.
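One simple way to compare two models systematically is sketched below in Python. It assumes each model's output has been reduced to a set of compound identifiers predicted to precipitate; the function names and the compound IDs are hypothetical and do not come from the actual models.

```python
def compare_predictions(model_a, model_b):
    """Summarise where two sets of 'will precipitate' predictions agree and differ."""
    a, b = set(model_a), set(model_b)
    return {
        "both": sorted(a & b),
        "only_a": sorted(a - b),
        "only_b": sorted(b - a),
    }


def hit_rate(predicted, observed):
    """Fraction of predicted precipitates that did precipitate once the experiments are run."""
    predicted = set(predicted)
    if not predicted:
        return 0.0
    return len(predicted & set(observed)) / len(predicted)


# Hypothetical predictions from two groups over the same virtual library.
model_a = {"cmpd-07", "cmpd-21", "cmpd-58"}
model_b = {"cmpd-21", "cmpd-64"}
print(compare_predictions(model_a, model_b))
```

Once the compounds have actually been made, `hit_rate` gives each model a score against the same public table of outcomes.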

Now going forward: I showed you that log, and there are no rules for that log except that my students have to record what they do, when they do it, and what they observe, but they're not required to type any special words or anything. The problem with that is it makes it really difficult to convert that into a format that a machine could use; you can't readily extract that and put it into a database. One of the things that we've been doing in the past few months is actually rewriting our logs in a format that should be machine-readable.

Here, I have a series of steps, a workflow actually, where you're allowed to add something, you're allowed to wait, you're allowed to vortex, you're allowed to take a picture, but you can't do anything else. We have words that we've defined to mean certain things, and we have parameters that we've specified. For example, we specify the molecule using the InChIKey and we also specify the common name so it's human readable but, essentially, that's just something that we decided to do that way for a bunch of reasons. But there's no reason that it has to be done this way.
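A restricted vocabulary like this is straightforward to validate by machine. The Python sketch below assumes a hypothetical line syntax of `ACTION key=value ...`; the talk does not give the group's actual format, so the action keywords, parameter names, and example log are illustrative only (the InChIKey is the one for ethanol).

```python
ALLOWED_ACTIONS = {"ADD", "WAIT", "VORTEX", "PICTURE"}


def parse_log_line(line):
    """Parse one log line of the (hypothetical) form 'ACTION key=value key=value ...'."""
    parts = line.split()
    if not parts:
        raise ValueError("empty log line")
    action = parts[0].upper()
    if action not in ALLOWED_ACTIONS:
        raise ValueError(f"action not in the controlled vocabulary: {parts[0]!r}")
    params = {}
    for token in parts[1:]:
        key, sep, value = token.partition("=")
        if not sep or not key:
            raise ValueError(f"expected key=value, got {token!r}")
        params[key] = value
    return {"action": action, "params": params}


def parse_log(text):
    """Parse a whole log, skipping blank lines."""
    return [parse_log_line(line) for line in text.splitlines() if line.strip()]


EXAMPLE_LOG = """
ADD name=ethanol inchikey=LFQSCWFLJHTTHZ-UHFFFAOYSA-N volume=1.0mL
VORTEX time=15s
WAIT time=4h
PICTURE
"""
for step in parse_log(EXAMPLE_LOG):
    print(step)
```

Anything outside the four allowed actions is rejected, which is what makes the resulting records safe to load into a database.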

If somebody wanted to convert this and have the SMILES, they could do that easily. They would just scrape the information and then convert it. So this, I think going forward, is going to be very important especially as we start to gather more and more information.

So looking at these results, we can also compare them. This is a little different from the table that I showed you earlier where we're looking for precipitates or no precipitates. These are actually individual results from experiments that are stripped out and left to stand on their own. So we would mix compounds together and then wait four hours and take a picture. That's not the whole experiment, that's just the first data point. Then we will take one at eight hours, and we will take one at 12 hours.

Now, if something bad happens on the 12th hour, let's say the student drops the sample, well, we will call that an aborted experiment. Everything in that experiment probably wouldn't be worth going through and digging through it. But if we extract every individual result as something that's addressable and minable, then it really doesn't matter what happens down the road.

Somebody may not be interested in the Ugi reaction whatsoever; they may just be interested in all the reactions where an amine has reacted with an aldehyde, and they would find that here. They would not get any interpreted information; they would just get a picture of what that looks like. What does this look like when I mix it together and wait four hours? So that's a strategy that I've been using to get a lot more information that's going to be much more machine-readable and machine-friendly.

A few things about the wiki: when we first started this project, we actually used the blog to record experiments. It turns out the blog is really not a great tool for this because if you change an entry, you have no record of it. You can't tell if it was changed, you don't even know who changed it. With the wiki, you can see all the recent changes; you can look at a specific page. So this is EXP150 that we've been looking at all this time. I can see all the different versions and I can see who made the changes. I can compare any two versions and, using Wikispaces, the new stuff shows up in green and the stuff that was deleted shows up in red.

I can use the wiki as a way to organize results, to explain something that is extremely difficult to publish - failures. So here's a little story about the synthesis of DOPAL and we tried to make this compound. We eventually did make it but we failed trying to make it in some really interesting way. This is just the story of that and it's got links to the papers we tried to use that had wrong information. It's just basically explaining all of that and I think that that can be useful for synthetic organic chemists. It's typically not something that you make public.

We also use a mailing list, which turns out to be really handy for collaboration between groups. My group uses the wiki almost exclusively, but when we collaborate with people who do docking, with people who do testing, the mailing list appears to work pretty well.

The other piece of this that we've just started to tap into is CDD, Collaborative Drug Discovery. Drexel University now has users on here and this is a way for us to basically store or retrieve assay results. It turns out that two of our compounds are active against falcipain-2 and are active in preventing malaria infection. They're not terribly active; they are less active than chloroquine or the best agents that they have, but it's a start, so those results are stored here.

Something really neat actually happened in just the last week or two. So what's the point of making all of this stuff public? Well, you get contacted by people that you totally don't expect. Brent Friesen at Dominican University runs the sophomore teaching lab and he was interested in having his students do something more interesting than just repeating experiments that they've been repeating for 20 years that everybody knows the answer to. He thought maybe the Ugi reaction might be useful for that, so he contacted me and we talked for a little while and we determined, "Yes, this will make sense."

So he just wrote his manual for the spring for Chem 254 and it is the Ugi reaction and his students are going to be doing new reactions that we haven't done yet and we're going to be testing those compounds against malaria. That would be really neat to be able to include students taking regular teaching labs; it's a whole untapped resource. It requires some more time, it requires some more dedication from people who run the teaching sections, but this could be very, very interesting and it could be very motivating for the students.

There have been some other people doing Open Notebook Science. A student from Gus Rosania's group was here earlier. Gus is a collaborator of ours; he's going to be studying the drug transport [indecipherable] for our compounds in the parasite and the red blood cells. Cameron Neylon over at Southampton has also been doing Open Notebook Science. He doesn't use a traditional wiki; he uses a modified blog for this.

I'd just like to end on this slide, on where I think science is headed. We've been living in a world where the only point of communicating science was really for other humans to understand it. We're getting into this really interesting time now where we can actually have human beings collaborating with machines, if the human beings choose to make information available to them. I think that's how we're going to get to this point where we can have machines actually doing real science: formulating hypotheses, testing them, analyzing the results, and then planning the next experiment.

I think to get to that kind of situation, we need to have free services, we need to have a possibility that anybody in the world can write a script that's going to try to process information and spit out something useful. Hopefully, this is one way that we can actually get there and, I think, it'd be very useful once we do.

So that's it.

ACS Talk on Teaching Chemistry with Second Life

Jean-Claude Bradley: OK. Thank you for the opportunity to talk about Second Life and teaching chemistry. What I would like to do first is step back a little bit. You know, we've been talking about using a lot of different technologies. And sometimes, we have to remember what we are trying to achieve with these technologies. Right? That lets us know whether or not we're wasting our time.

So, I'm just going to take a few minutes to go over what is the role of the technology teacher. I think that as technology changes, we have to remember that we still have to make chemists. That's our job ultimately, right?

And what does this mean? Well, for me, it means that we are trying to make individuals who are chemically literate at the undergraduate level and competent to create new useful chemical knowledge at the graduate level.

So, any tools that enable us to do that, we should use. Any tools that we are currently using that are no longer effective we shouldn't use. So, what does this mean in terms of actual activities from the teacher? So, how do we actually teach?

We have to do three things. First, we have to select and build our content, then we have to assess and validate the skills and knowledge of the students to know if they learned it or not and third we can help catalyze the learning process.

This is the third one here that I have basically put Second Life into. This is the stuff that you can only do if you have enough time to do it. So, I need to explain how the rest of my course operates to show you how it is possible for me to spend time on this.

So, very quickly, I'm not going to talk at all, or very, very minimally about blogs or wikis or things like that here. I just wanted to give you an idea of the evolution of my course over the past several years.

I teach undergraduate organic chemistry. I started off with, of course, face-to-face lectures using paper handouts. Then I graduated to using optional screencasts of lectures, and I'm recording this as a screencast. Then I went to audio podcasting, video podcasting. I started using blogs, wikis, Google Video, YouTube, Google Co-op.

Finally now, I'm really not even using podcasting in my class because everything's already archived. So, it turns out it's far easier to just give the students a zip file and just have them unpack it, because I'm not currently recording lectures. Let me show you why I'm not currently recording my lectures.

This is what a screencast looks like, if you're curious. There are several different formats. You can have Flash; this happens to be the m4v format, or mp4, on iTunes. The students just basically click this, or they can expand the screen, and they click play and they can hear me explain the chemistry.

So, it's very, very similar to being in class and watching me doing what I'm doing here, right, just presenting. So, what happened is I actually plotted the attendance in my class over time and observed that in all of my classes the attendance by the end was between 10 and 20 percent.

So, when it ended, I looked at the performance of the students who were still coming to class versus the students who were getting the class material exclusively through screencasts, and it was identical. So, that's one of the things that I've stopped doing: actually lecturing.

Now, instead I just assign the recorded lectures and I do other things with the time, which I'll get into in a second here. OK, so while all this is going on in terms of content and delivery, I'm also changing the way that I'm assessing students.

Obviously, I started with manual grading, then moved on to using WebCT quizzes, voluntary, then making the exams and the tests all mandatory on WebCT. Then, I moved on to doing extra credit assignments using blogs and wikis. Quizzes in first person shooter games like Unreal Tournament, quizzes in Second Life, and finally student assignments in Second Life.

I will obviously focus on the last two, but I will show you a little bit of what these things look like on the way. So, this third part that I was talking about earlier, this learning catalysis what does that mean? Well, sort of by default it's your office hours and email and the time that you're not in class.

Because I now assign my recorded lectures I have time to do things like be present while students do problems, while they watch lectures, while they play games, while they use Second Life. We can talk about the extra credit assignment. I can spend more time on that. I can address technical issues very, very important where they bring their laptop to me and I can actually see what the problem is.

You don't want a situation where you're assigning archived lectures and the student has some kind of technical glitch and they can't do it. That would be a nightmare.

This is what I end up spending my time doing in my workshops.

OK. Student assignments: So, starting with blogging, I don't do blogging anymore, but that's how I started. Basically, I just had them pick something relevant to the class and blog about it, all right, so this was a couple of years ago.

I then moved on to using the wiki. I think, it's a little more useful because you can modify the text and you can see all the different versions. What I do here is I basically have them do an assignment based on my research wiki. We are making anti-malaria compounds. So, they can actually see our lab notebook and they have to make some kind of comments on that on their projects.

We're using JCAMP format for the NMRs, using JSpecView to view the NMRs and everything. I obviously don't have enough time to get into full details about that, but if you ask me, can wikis be used for education? I think yes, and I've got some pretty good examples, I think, of having students use them in assignments.

Let's talk about games a little bit. I've been using games for many years, even before WebCT and everything. This is a game that I played a long time ago called Wheel of Orgo, where I start with a starting material and I put a final product and then students take turns trying to come up with steps.

So, they have to put the reagents and they have to put the intermediate. And they get points for having that correct. The idea is, of course, to find a path from the beginning to the end. That's an example of a game. Something where it's the same material, it's just delivered in a different maybe more entertaining format.

Another kind of game that I've used is Unreal Tournament. If you're familiar with that, it's a first person shooter game. There actually is an educational version of the game that doesn't have any weapons.

The way that this one works is you've got these doors and they're just images and they're either correct or they're incorrect. So, if you walk through a correct door, you make it to the next room, if you walk through an incorrect one, you have to start over. You could do races like that.

So there's no injuring or anything like that. Of course on the full version with weapons, there's lots of blood going on. We didn't use that one too, too much, but it certainly was available.

Now, I would probably still be using Unreal Tournament if I hadn't come across Second Life. Second Life actually is a much better infrastructure for doing this kind of thing. Here instead of having rooms, we basically have obelisks. So, if you go on Second Life, and you're going to see lots of pictures here, so you'll get the idea.

This is me, by the way, and this is what I see as I'm moving through the landscape. So, what happens is students click on this obelisk and the same four images pop up that I had in Unreal Tournament. And when they click on it if it's correct they get another set of four questions. If it's incorrect they have to start over.

So, I can actually have many obelisks in a room and I can have many students competing against each other. That's what I refer to as my races, which I do a couple of times per term and I give out prizes. So there's no grades here, it's basically just additional stuff the students can do.
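The quiz logic described here (a correct click advances to the next set of four images, an incorrect click sends the student back to the start) can be sketched as a small state machine. This is a Python illustration of the logic only, not the script that runs inside Second Life; the class name and the answer key are hypothetical.

```python
class QuizObelisk:
    """Tracks one student's progress through a fixed sequence of four-choice questions."""

    def __init__(self, correct_choices):
        # correct_choices[i] is the index (0-3) of the right image for question i
        self.correct_choices = list(correct_choices)
        self.position = 0

    @property
    def finished(self):
        return self.position >= len(self.correct_choices)

    def click(self, choice):
        """Register a click on one of the four images; return True if it was correct."""
        if self.finished:
            raise RuntimeError("quiz already completed")
        if choice == self.correct_choices[self.position]:
            self.position += 1  # correct: move on to the next set of images
            return True
        self.position = 0  # incorrect: start over from the first question
        return False


# Hypothetical three-question quiz; one obelisk object per student makes a race.
obelisk = QuizObelisk([2, 0, 3])
for choice in (2, 1, 2, 0, 3):
    obelisk.click(choice)
print(obelisk.finished)
```

Giving each student their own instance is what allows several students to compete side by side in the same room.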

Now that's the stuff that I did. What about my students, are they doing anything in Second Life? So, I have had students do some extra-credit work, and one of the things that we have the ability to do now is to actually create molecules, starting from SMILES or InChIs, and we just talk in Second Life, using the chat box, to these little rezzers that are built to do that.

And here's an example of camphor, showing a pretty complicated example of chirality. So, these molecules are actually much bigger than the size of our Avatars. And you can walk around the molecule, you can fly around it, you can see it in a way that's completely different than you would in a small model, which you're limited to in real life. So, that was a pretty good assignment.

And we can have fun with it here. You can actually fly around on the camphor molecule, and the students can engage in that way. So, this is actually on Nature's Island, called Second Nature.

Right now, what we're doing is we're trying to work on displaying spectra, especially NMR spectra, in Second Life. And I'm working with Andy Lang, and we've almost got it to the point where we can actually talk to the spectrum and have it expand. Again, we're using JCAMP format, and if we can get it to work with JCAMP, we can get it to work for NMR, or IR, or whatever. Because a lot of the equipment can add spectra in those formats.
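For readers who want to experiment with the format mentioned here: JCAMP-DX is a plain-text format whose header is a series of labeled records written as ##LABEL=value, with $$ starting a comment. The following is only a minimal sketch of reading those header records in Python. The sample spectrum and the function name parse_jcamp_header are made up for illustration, and this is not the script used in Second Life.

```python
def parse_jcamp_header(text):
    """Collect the ##LABEL=value records that come before the data table."""
    header = {}
    for line in text.splitlines():
        line = line.split("$$", 1)[0].strip()  # "$$" starts a comment
        if not line.startswith("##") or "=" not in line:
            continue
        label, value = line[2:].split("=", 1)
        label = label.strip().upper()
        if label in ("XYDATA", "PEAK TABLE", "END"):
            break  # the numeric data starts here; this sketch stops
        header[label] = value.strip()
    return header


# Hypothetical file contents, invented for this example.
SAMPLE = """##TITLE=hypothetical 1H NMR spectrum
##JCAMP-DX=5.00 $$ made-up example file
##DATA TYPE=NMR SPECTRUM
##XUNITS=HZ
##YUNITS=ARBITRARY UNITS
##NPOINTS=4
##XYDATA=(X++(Y..Y))
0 10 20 30 40
##END=
"""

header = parse_jcamp_header(SAMPLE)
print(header["TITLE"], header["XUNITS"], header["NPOINTS"])
```

Because the same labeled-record layout is used whatever the technique, a reader like this would not care whether the file holds an NMR or an IR spectrum; only the values of fields such as DATA TYPE and XUNITS differ.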

Other educational things that you can do. Well, you can actually demonstrate docking. So here this is actually the receptor site of enoyl reductase, which is one of the malarial enzymes that we're trying to inhibit. And you just walk up to this molecule, you click it, and it slowly meanders down and fits right into the docking site. And you can walk around it and you can see what's going on. It's actually a little bit hard to see where the hydrogen bonds are, but it's a good exercise nonetheless to try to see it.

Now, if we put the entire enzyme here, we would run out of prims, which are the fundamental units. So, we only put the pocket. But if you want to simplify the enzyme, you can actually create the entire enzyme, but you don't have every atom here. Right? You're just showing it in a simplified mode. Peter Miller has actually been very active with the constructions of proteins in Second Life, and you can go right from the PDB file to this 3D format. All right?
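One way to picture the simplified mode, as a sketch only: a PDB file lists atoms in fixed-column ATOM records, and a common simplification is to keep just the alpha carbon (CA) of each residue. The Python below assumes that approach; it is not necessarily how the Second Life builders do it, and the helper names and coordinates are made up for the example.

```python
def alpha_carbons(pdb_text):
    """Return (residue name, residue number, (x, y, z)) for each CA atom."""
    atoms = []
    for line in pdb_text.splitlines():
        # ATOM records use fixed columns; the atom name is in columns 13-16.
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
            atoms.append((line[17:20].strip(), int(line[22:26]), xyz))
    return atoms


def atom_line(serial, name, res, chain, resseq, x, y, z):
    """Format a minimal ATOM record (hypothetical helper, demo data only)."""
    return (f"ATOM  {serial:5d} {name:<4} {res:>3} {chain}{resseq:4d}    "
            f"{x:8.3f}{y:8.3f}{z:8.3f}")


# Made-up coordinates for two residues.
pdb = "\n".join([
    atom_line(1, " N  ", "MET", "A", 1, 27.340, 24.430, 2.614),
    atom_line(2, " CA ", "MET", "A", 1, 26.266, 25.413, 2.842),
    atom_line(3, " C  ", "MET", "A", 1, 26.913, 26.639, 3.531),
    atom_line(4, " CA ", "GLY", "A", 2, 25.112, 27.900, 4.100),
])

backbone = alpha_carbons(pdb)
print(len(backbone), "objects to draw instead of", len(pdb.splitlines()))
```

Keeping one point per residue is what makes a whole enzyme fit in a limited prim budget, at the cost of losing the side-chain detail that the pocket-only model preserves.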

So, things are getting a lot easier for chemists in Second Life. There are more tools, and again, if you can get the PDB, or the InChI, or the SMILES, you're in really good shape now for actually just doing these projects.

Other things we can do. Well, we can actually see how chemical reactions happen in 3D. So, here's an example of an imine formation. I've got an aldehyde, I have an amine. So you can see the aldehyde here, the red oxygen, and here's the nitrogen of the amine. And you actually talk to the chemicals. So you come right up to them and you say, "Next." And you'll see every intermediate pop up.

So, the next one here would be like this, where the nitrogen has moved over and now it's connected to this carbon, and you have a carbinolamine intermediate. And you can actually see the entire shift. And these are actually chemically realistic: they're minimized, so that's really what they probably do look like. And it looks very different than it does on a piece of paper, right? Where everything's flat. So, I think this is a pretty good exercise. And there's every step. So, there's a couple of intermediates in that imine formation, and if you keep saying, "Next," it will keep going.

Now, as we've been putting molecules on Second Life we've been worried about how people are going to find them. And one of the things that I set up is a wiki: secondlifemolecules.wikispaces.com. And here we basically just put the descriptors, we put the uses, and we put the SLURLs. The SLURL is a link that, when you click on it, will take you to that specific location in Second Life. So, that's a way for Google to index it.

We've got a 3D Periodic Table; again working with Andy Lang for the ACS, we actually did this. You can have faculty offices. This is sort of what my office looks like on Drexel Island. You can put whatever you want there. Here's an example of my lab. So, these are just pictures of my students, there's pictures of our chemicals, you have pictures of the equipment. There are all kinds of things that you can do to explain what you're doing.

But the one thing that works extremely well on Second Life is posters, and you're going to see some of that at the Sci-Mix Session tonight, if you come. And that's because you can do some really nice things: you can take your PowerPoint and you can just dump it into this viewer in Second Life. And when you click on it, it changes slides. So, from the standpoint of a learning curve, there's really not much to it. You just give your PowerPoint, boom, you're done. Right? And then you could put these little bells where, when you click on it, it will summon the presenter. So, that's what we're going to do tonight at Sci-Mix. We're going to make sure that all the presenters have their little bells working.

We've had conferences. Here's an example of a conference that I organized. The SciFoo Lives On Conference on Nature's Island. And you can see there's lots of people hanging around talking to each other, and we're using the poster boards to give the presentations.

This is ACS Island, if you're not familiar. It actually has the shape of the ACS logo. So, this is the land, and the water here gives it the outline. So, if you see lots of moats and you follow them, that's the reason. It's to have this look.

So, these are just a few screenshots of ACS Island. Here's the headquarters. You can go inside here and get some freebies. Get some lab coats, goggles, whatever. There are the ACS Landmarks. I wasn't familiar with the Landmarks before I started this project, but every year or every couple of years ACS gives these Landmarks, like the foundation of Chemical Abstracts, or the discovery of helium, things like that. So, these are pretty neat, you can visit this, click on it, and it will take you to the website.

And here's what the virtual poster area looks like. So, tonight at 8:00 you can come around. So, the posters look pretty familiar, and we've put some molecules here that represent most of the posters. So, for example, the forensics lab talks about methamphetamines, so guess what? There's a molecule there to attract people to come over and look at it.

[laughter]

There's also nanotubes, and Cass actually has a really interesting presentation. Hopefully, you'll be able to attend. This little molecule here I put is called felicene. It actually looks like a cat, and we put in some eyes and some whiskers so you can see it here, but the ears are actually on the top. So, again, very, very interesting what you can do in Second Life.

ACS Island has a resident chemist program where scientists can put up some of their work. In our [indecipherable] lab at Cornell is there. The Rosania Lab (that's Rosania from Michigan) is there. And he basically put up images from his microscope. So he uses that to have meetings with his students and talk about scientific data. And there's also a geodesic dome, so when ACS does run parties, that's where it will happen. And that's a good place to meet. I don't know if anything tonight is going to happen there, but you can check it out.

And so basically, that's it. I mean, if you want to use wikis, it's a pretty good way to organize content and to interact with students. I didn't have time to go through that, but because of the versioning, you can actually tell students to do things on the wiki and then when they correct it they can remove your comment, but it's all in the versions. So that's a good way to interact. And you can use Second Life to stretch student and teacher imagination.

So, what I'm showing you is just what I thought was interesting and cool to do. There's many things we can do in chemistry and if you have any ideas, just let me know and I can tell you if it's realistic or not. And I guess the bottom line is really, we're not trying to replace one means by another, it's just additional channels. We're communicating the same basic chemistry; it's just, to make it interesting and to get students engaged. That's really the point of it.

And that's it. Thank you.

Thursday, April 10, 2008

Albright Talk on Educational Technology

Transcript of Albright Talk on Educational Technology.

Jean-Claude Bradley: All right. Thanks very much for having me. What I'd like to do today is to give you sort of my journey with trying to use technology to better my teaching. I teach Organic Chemistry at Drexel and so a lot of the things that I'll be talking about are Chemistry-specific. I'll be showing molecules and reactions and things like that. But you may want to think in terms of your own fields and how that can map, especially when I talk about things in Second Life. So those are just specific examples but this will be completely general.

So again, I'm going to focus on Chemistry here. It's not really about the technology, it's really about the teaching and what we're trying to do as teachers. We're not trying to do anything else except produce people who are going to be competent in their field. I think people lose that a little bit when they start to think it's about the technology, but it really isn't. If you make it about the technology, then it's not going to work because people are not going to be using it for actually doing what they were hired to do.

So if we focus on that and keep remembering that, we can keep adapting as new technologies come on board. You have to think about--again, from the Chemistry perspective, if I'm trying to make chemists, what does it actually mean? I think that it really boils down to having somebody who's chemically literate at the end of the undergraduate level and someone who's competent to create new useful chemical knowledge at the graduate level. I'll be showing you tools that can operate at both those levels and they might not necessarily be exactly the same tools so I'll show you that by example.

As teachers, how do we actually do this? How do we make chemists in the case of Chemistry? We have to do three things pretty well. We have to select and deliver our content. That's our choice. We have to assess and validate the skills and knowledge of the students. So we can give content all we want but if we don't evaluate the students, we don't know if they've learned it or not. The third thing is we can actually operate by catalyzing the learning process, which is something that most teachers don't have time to do because they're very busy trying to deliver the content and trying to assess their students. That for me is probably one of the most important advantages of this new technology: if you use it in the right way, you can generate this time that you can use in a way that you couldn't previously.

So I'll be talking about specifically Organic Chemistry courses, sort of a journey, like I said. I'm going to be starting with--this is what I started with in '96 when I came to Drexel. Traditional lecture-based course, of course, done face-to-face, all the grading is completely manual. There was really--sure, I had office hours, we all have office hours and we use email, but that's just really not enough time for doing a lot of the in-depth work that I am now able to do with my students. But this is our starting point.

So I'll be going through each one of those three. The first one is assessment. What actually happened over the course of time? As I just mentioned, initially we had manual grading and then I started to use WebCT. I think you guys here use Blackboard; they're now merged, so it's basically the same concept. Do any of you here use the Quizzes feature? A few people.

Basically, my classes started to get larger and larger and it seemed to be something that was worth investigating; it is an investment in time. Some of the things I'll be talking to you about are quick fixes or quick to implement. This is not necessarily one of them because you do have to set up your question base and you have to learn how to use the system, but I was very happy with the kind of things that I was getting out of the system. The first time that I implemented them, I was not grading the students, I was just experimenting with the system. Once I saw that yes, this could be used for grading, I moved on to do all my tests and exams with automated grading. I use the video surveillance system that's already at Drexel; it's already built under public safety. So that's an option; depending on what level of security you want, that's something that you might consider.

I started to get in to using blogs and wikis for different reasons, but I saw the potential for teaching. So I will give you some screenshots and some examples of the types of things that you can do with that. By then, I got involved with gaming and so I used Unreal Tournament which is a First Person Shooter game, kind of like Halo, that kind of style. You'll see the evolution of that whether or not weapons make a difference and all that kind of thing and at the end, I actually settled on Second Life. So I actually don't use the First Person Shooter game anymore because I think Second Life is a lot richer and it has more potential, actually.

OK. The other component - the content delivery, again, evolution over about 10 years. So first, of course, face-to-face with paper handouts and then face-to-face with online PDF. A lot of people use the Blackboard, the WebCT systems initially to upload their syllabi and that's a no brainer. It's very, very simple to do, it's something that all the students can find very easily, it's very little work, so that's sort of a no brainer.

Next here, I started to record my lectures. I'm using Camtasia, but I hear some of you guys are using CamStudio. Anybody here using CamStudio? So George was telling me of his use of that. It will record whatever is on the screen, so I'm actually recording this talk now. Then if you wanted to review it, you can give a link to somebody and it will be a pretty similar experience. But these were optional in the sense that I'm still giving the lecture in class but I'm also recording it and then making it available to the students, and there's no penalty for them not following it.

A little bit later, actually, this whole podcasting phenomenon started to get going and, actually, a student asked me about it. He had a slow connection at home and he said, "Do you have just the audio of the class?" I hadn't considered it before and I started to look into it. I learned the tools for how to podcast and, basically, it was helpful. But again, I didn't replace the screencast recording, because it was just another channel, and I was also providing PDFs over iTunes and things like that.

Then I started to get more heavily involved using blogs to deliver content. All right, and I'll definitely show a lot of examples of that. Video podcasting through iTunes, and I did start near the end to use a wiki and you'll see that it's a completely different function than a blog. The wiki is really to organize things so as to make them easy to find, and it's very, very quick for anyone to learn how to do it. So that's one of those quick and dirty things that actually does work.

I will then talk about my use of YouTube and Google Video, these also turned out to be very, very handy for certain types of problems. YouTube, for example, you can't really upload anything more than 10 minutes so you wouldn't necessarily put a whole lecture on there but you could put quick solutions to problems and I find that to be very useful. And because it's so public, you get contacts from people around the world to other Organic Chemistry students who either like what you did or point out a problem. I think that's extremely useful as a teacher.

I won't talk too much about Google Co-op, but I'll show you that there's a search bar. Google is, of course, a wonderful company, and there's a lot of things that they enable you to do for free. One of these things is called Google Co-op, where you can actually specify a whole list of authorized links for your class. If there's an online textbook you like, if you want to use Wikipedia you can include it; if you don't want to use Wikipedia you can exclude it.

So when a student types their search term, it won't search the whole Internet. It will just search the resources that you've approved. That's a really nice little trick. Again, nice quick and dirty thing you can do. I'll show you right now, I'm not using podcasting in the way that I did initially, because now I have archives. It turns out that if you already have a full archive, it's a lot easier to just give it in a whole ZIP file for the student than showing them how to use iTunes. I'll show that.
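The restricted-search idea can be pictured as an allow-list: a result is kept only if it comes from a site the teacher approved. The Python below is just a conceptual sketch of that filter. It is not how Google Co-op is implemented or configured, and the host names and function names are hypothetical.

```python
from urllib.parse import urlparse

# Hypothetical list of sites approved for the class.
APPROVED_HOSTS = {"en.wikipedia.org", "example-orgo-textbook.org"}


def is_approved(url, approved=APPROVED_HOSTS):
    """True if the URL's host is an approved site or one of its subdomains."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == a or host.endswith("." + a) for a in approved)


def filter_results(urls, approved=APPROVED_HOSTS):
    """Keep only the search results that come from approved sites."""
    return [u for u in urls if is_approved(u, approved)]


results = filter_results([
    "https://en.wikipedia.org/wiki/Camphor",
    "http://random-forum.example.com/thread/42",
    "https://www.example-orgo-textbook.org/chapter1",
])
print(results)
```

Excluding a source, such as Wikipedia in the example from the talk, would simply mean leaving its host out of the approved set.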

You notice here that I went from recorded lectures, making it optional, and then eventually I completely replaced the live lecture. For one of the terms I actually studied the attendance of the students. Again, this is video, so it's not just audio podcasts. This is the whole experience, everything goes on the screen, all of my recordings. The attendance drops until the end of the term and goes to 10-20% in both classes. This gave me a great opportunity to evaluate the performance of students who were actually still coming to class versus those that weren't.

The performance was identical. I realized that I was wasting my time repeating myself term after term, and from that point on I assign the recorded lectures in the same way that I would assign a book chapter, and I would do workshops with my students. They'll come in and we'll work on specific problems. I have time to work with them one-on-one, we can do all kinds of things, we can get into Second Life. Basically, that frees up the class time for me to be more involved with my students.

This is what we're talking about here. This whole Learning Catalysis concept, which I think is absolutely critical as a teacher. You have to have time to do this kind of stuff. It's not enough just to put the content out there and to test the students. You don't, honestly, need a teacher to do that. The University could fire all the teachers if that's all it came down to. Teaching is about interacting with students and it's about taking your knowledge base, evaluating the knowledge of the student, and trying to customize your interaction with them.

Of course it requires the participation of the student, and that's their prerogative. If they don't want to attend those sessions, that's OK, but they're not going to get as much out of the class as the students that do come. It's just a teaching philosophy that I have that I think works. We're almost up to the screenshots, I want to show you how this is all laid out. I do still use a course management system, we use WebCT. It's not the center of the class.

The center of the class is actually the wiki, and this is public. I try to make as much as possible, all of my class material public. The assessment of course is not. Well actually a part of it is, there's a demo account where you can take my quizzes. But in terms of the tests that my students take you have to be registered in the class so there's a whole security thing.

But otherwise, the wiki is a central thing. We've got the content as available as possible. I have the interaction of my students. I'll show you some of that. I already talked about the workshops. I have students doing blog assignments sometimes, wiki assignments, and I can show you examples of that; but everything starts at the wiki. This is a little slide; don't take it too seriously when I say the unhappy old person here. All of the tools have their uses, but it is kind of nice when you discover these new tools.

Instead of pushing on your students, emailing them or contacting them in some direct way like with a phone, most of these new technologies rely on a pull system. You put yourself out there, and the receiver is actually the one that goes and subscribes to your stuff. You can't force anybody to read your material. Yeah, it's great in many ways in that you can get more subscribers. It's not as forceful. Some of that is a disadvantage. You do have to assume that the students are going to be subscribed to your stuff.

It's not necessarily happy and bad, but I think once you've experienced both you can then allocate which tool you want to use for which purpose. All right, so these are the screenshots. Again, this is for Chemistry, but they can apply to any field. This is the website that the students get when they're first introduced to my course. Here is the Google Co-op search. It's the first thing on the wiki page. If they want to search for a concept in the class, they can put their search term in there. That will search all of the online textbooks that I mentioned, but it will also search all of the transcripts of my lectures. It'll search any content that I have that is in text form. This is actually a really convenient tool.

I record the first lecture. I put it in Flash, I put it in MP3, whatever; and if a student misses the first lecture, which is the only live lecture of the class, then I can point them here. That's actually pretty handy. Really, it's just a little explanation of the class.

On the bottom of this front page are links to all of the resources of the class. I'm not going to go through all of these today, I just took screenshots because I don't rely on the Web. You never know when it's not actually going to work when you need it. Of course, all of these are what they say.

The idea of the wiki is that anyone that has permission can go in on it, can hit an edit button, and can add a link or can add content. So this didn't develop by anyone writing HTML or anything like that. It didn't develop by any plan, actually. It's just, "Oh, I have to give my syllabus, I'll put it here." As you keep getting the same questions I should probably have a FAQ page, so you add it there. The wiki develops.

Again, if you remember this "push-pull" system, anyone in the world can subscribe to this wiki for changes. With Wikispaces you can get it either through a blog reader or through email. That's also nice, and it's something you don't get with a regular web page. That's just part of the benefits.

These things are new here. I've just found if you already have an archive of lectures, there's no point in messing around with podcasting. Just give the whole thing; it's about a gig and a half; and then they're done for the term. They can access everything.

Student: That's with the audio? That can't be video, can it? It can?

Jean-Claude: Yeah, m4v files. So the m4v files will play on their video iPods as well. Yeah, it's not that big actually if you put the settings right.

Student: You're just capturing the screen.

Jean-Claude: Yeah, I'm just capturing the screen. I think they're like 50 megs per hour, it's really quite reasonable. And at the bottom here is a site meter. And what this does basically is it's a free service that sees how people are finding your site. So if they're using Google with certain search terms you can look at that. And it's a thing that I like to put on all my websites, whether they be blogs or wikis. First of all it's free; secondly it's pretty interesting, and actually, you can make that stuff public. So people should know what kind of website it is. And you can get a really good idea by looking at recent hits.

OK, so the blog. So the first time that I taught this class and recorded it I used the blog, which made a lot of sense because, with the blog it's chronological. So you have a class and then you go and blog about it. So you write some things, you upload your files, you can link to the PDF, you can link to the flash recording, you can link to the audio.

And there's a way of doing this with Blogger that enables you to do podcasting really easily. So if some of you want to talk to me about that afterwards I'd be happy to discuss it. So if this is a live class, it makes a lot of sense because you get new posts after every class. But once the class is all recorded, and I'm not going to be redoing that class live anymore, the blog is not necessarily the best way to contain the information because it's not live anymore. So this one hasn't changed since 2005, but it already has all of the text for all of my classes so I still use it. But if I were to redo this I would probably put this in a wiki because it's basically a static document at this point.

Do any of you have blogs, maintain blogs? OK. So with a blog again it's just, it's a web page that has chronological postings. So if you want to follow it you can use different feed readers. I used to use Bloglines I now use the Google reader. I think it's pretty good. It comes with your Gmail account. And whenever there is a new post, it will show you that you have so many posts. So if you're following a lot of blogs at the same time, it's better to have a reader because there's just one place to go. If you're only following one blog then it makes no sense to have a reader you might as well just go to the website. Right, so a reader would be to manage multiple sites. So again I used to teach my students how to do this, but the reality is now that since I don't have any new posts, there's no point. I just give them the zip archive.

Of course, one advantage of having it in a podcast format is you can put your stuff on iTunes. So that people can find you like that, they just click on subscribe. And then you know this will sync up with your video iPod. iTunes can also handle PDFs. So if you want to distribute material like that. It won't play on the video iPod I don't think. But it will certainly play on their computer. And this is just to show what the interface looks like for the student.

So you've got the PDFs showing up with a little book next to them and then you have these other recordings. There's none like this here. If it were video then it would have a little television set and if it doesn't have anything then it's just audio by default. Again, for the video iPod, you would convert your file to an m4v format, and that's not very difficult to do. And that will also play using QuickTime, so they don't need a video iPod to look at the m4v format, or the mp4, same thing.

So I've used the blog in a number of other ways. And again this is an evolution, so before I learned about wikis I used a blog to record these FAQs. This is the big limitation. If you have a change here, you can certainly change a blog post. But blog posts are not really meant to be constantly changing. They're meant to be published once, and then you archive them. And you can access them at any time. But there are always changes in a class, right?

So the FAQ now I actually run on the wiki but initially I would just run it like this on a blog. So basically what's nice when you make these things available to the rest of the world is that you find there's actually a lot of interest. I know that there's a lot of premed students that contact me about my classes. And when I was first doing this I don't think that there were any other organic chemistry video recordings. But now there's a lot -- Berkeley has them -- there's a lot of different places you can get them now. But especially in the beginning, a lot of students were really appreciative to be able to access that, because they may be in a course where they'd like another interpretation of the material. So having the lecture recordings is really useful if you want to have open coursework.

OK, now going to the assessment part, so student blogging. Well, it is not mandatory in my class. I have like between 150 to 200 students usually and you don't want a blog with 200 students. Because this is one of these things, it will require more time. You cannot assume that students are familiar at all with the technology; most are not. And so you have to really start at the beginning. So I do this as extra-credit work. I'll offer up to 2% of their final grade on a Second Life assignment, a blog assignment, or a wiki assignment. And that's worked pretty well for me, where I might get maybe five students out of the class who will really take me up on this and do the actual work for it. But this is nice for them as well because they have a record of their class that's open to the public and that they can use as a link, sort of their e-portfolio.

Now I didn't talk about this at all but in my laboratory I actually have a wiki that's used as the lab notebook. So my undergrad students and my grad students working in my lab will actually post new pages for all the experiments they're doing. And what's nice about that is it's public and I can use that as an example of real active research to show my undergrads what research is actually like.

And so I've been doing that for the past couple years, where these students -- remember this is the introductory organic chemistry -- so they're just learning about this stuff. But as they see concepts they have to try to relate it to something in the lab notebook. And, of course, that's messy because students are posting to it all the time, they're updating. So it is kind of nice to give them that other view, where you know how everything works perfectly in your textbook? Of course real life isn't like that. So you can show them a little bit of that flavor.

An example is this. We use NMRs, which are basically just plots which we use to analyze chemical data. I don't want to talk about chemistry very much, but this is something we did not cover in class, the fact that this line here is split into three. It was something that the student came across, and she talked to me and said "Why doesn't this match what we learned in class?" We had a chance to discuss this, and she was able to put this up as a wiki post.

She was able to show the data and then link back to the actual experiment from the lab. So this I really like a lot. The way that we teach things is always too simplistic and it's nice to show them the exceptions; it's nice to show them how it actually works.

So why a wiki? Well, a wiki is very different from a blog in that with a blog, you're going to be posting chronologically. So if you make a blog post today, it will be March 17th. For the rest of time you'll be able to link back to this entry. You can change a blog post, but you should really, really try not to. People don't expect them to be changed so they're not really looking for that.

A wiki is exactly the opposite. People expect the wiki to change constantly. So you can have a wiki page that could go through 30 different versions. That's OK because you can always go back in time and see what all the previous versions were. Here, for example, is a page of one of these wiki assignments that my students did. You can see every version here. This is my undergrad, this is me, this is my grad student, and you can see everybody's contribution to that page. Unless you're the organizer, you can't really delete the page. If somebody were to come and just put blanks everywhere, you could just revert to a prior version. A wiki is pretty safe in that regard, and it's extremely nice to be able to see all the different versions in a project.

The way these show up -- I use Wikispaces, which is a free and hosted service. If any of you are interested in starting a wiki, you can do that 100% for free with this service. This is a nice feature here; when you compare two versions, it will show you in red the stuff that got deleted, and it will show you in green the stuff that got added. You don't have to look very hard for the differences between versions of the pages. If you ask a question of a student, and then there's a change, all you do is look at the two versions. You'll see their new stuff in green. Sometimes students will go and delete your comments without making any changes, so you can catch that really easily with this feature.
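The version comparison described here can be sketched with Python's standard difflib module. This is only an illustration of the idea of listing deleted and added lines between two versions of a page; it is not how Wikispaces computes its comparison, and the page text and function name are made up.

```python
import difflib


def compare_versions(old, new):
    """Return (deleted lines, added lines) between two versions of a page."""
    deleted, added = [], []
    for line in difflib.ndiff(old.splitlines(), new.splitlines()):
        if line.startswith("- "):
            deleted.append(line[2:])  # what the wiki would show in red
        elif line.startswith("+ "):
            added.append(line[2:])    # what the wiki would show in green
    return deleted, added


# Hypothetical page history: the student removed the instructor's comment
# without changing anything else.
old = ("Mechanism draft\n"
       "Instructor: please add the intermediate\n"
       "Step 1: mix reagents")
new = ("Mechanism draft\n"
       "Step 1: mix reagents")

deleted, added = compare_versions(old, new)
print("deleted:", deleted)
print("added:", added)
```

A deleted comment with nothing added is exactly the case mentioned in the talk, and it shows up immediately when the two lists are put side by side.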

This is the site meter that I told you about earlier today. Not only does it tell you how many visitors you get and what their search terms are, it will also show you where they are coming from. I'm just showing this as an example of the kind of world that we're in today. This is a course that's only taken for credit by students at Drexel. But, yet, there's people from around the world that are somehow finding it and making use of it. I would encourage any of you, if you are interested in sharing more of what you do and if you can, to do so; this is definitely a good time in the Internet era for it.

I want to talk about a few other different things you can do with technology. This is a little game that I did with my students -- more upper level organic, maybe Organic III -- that I call "Wheel of Orgo". The idea is I will use a Tablet PC; you can also use a Smartboard but I prefer the Tablet PC because you can write on it.

I will put the starting material, and I will put the product we're trying to make. We'll go in turn, where the students will put one step. A student might try this, and then they'll write something. If they're correct they get a point; if they're not correct, I can correct them, and they can try again when their turn comes around.

The idea is to complete the synthesis. You either put an arrow going forward here, or put an arrow going to the product. At some point the two will connect and, by definition, you have a chemical synthesis. What's the investment in time here? It's really nothing. If you already have time to spend with your students, that catalysis time I was telling you about, this is a perfect way to implement the boring stuff that you teach through your content delivery mechanisms.

All right, a little bit more about games. I told you about the first-person shooter game Unreal Tournament. There's actually two different versions of that game. There's the popular version with the weapons, but there's actually a free version that doesn't have any weapons. You can build mazes, castles, whatever you want, and people can walk through them. I basically set up a system where you've got these doors. Say a room has four doors. On each door is going to be an image that's either correct or incorrect. So three of these images will somehow be incorrect and one of the images will be correct. The idea is they walk through the door that's correct. If they do walk through the correct one, they go to the next level in the maze. If they walk through an incorrect one, they have to start over.

I was able to have a system where I could do races with my students. They would all start at the same time and the first one to make it to the 20th level would win a prize. I would give them a molecular model kit or a book or something like that. [laughter]

This is something else that works with some students. Not everybody wants to do it so I never make this mandatory. I think you run into problems with these things when you start to make them mandatory.

I'm not teaching new material here. This is the same material that's in my lectures, the same material that's in my quizzes. It's just another way for students to do it differently. Now I did actually use the weapons version. The disadvantage with that is that it's a commercial game, so you actually have to figure out how to buy it. We had a couple copies installed in the libraries and things like that. I think if Second Life had not come along I would probably still be doing this.

But Second Life did come along. It's just so much richer in terms of the things that you can do. Second Life is not a game; Second Life is just an environment. Just to orient you here, these are the avatars. This is actually me in this particular role and this is how I view the world. I'll try to go on after; I think I can connect to the Internet and I'll walk you around.

So what I used here are obelisks instead of the mazes in Unreal Tournament. So when you click on the obelisk these four images come up. These are either JPEGs or bitmaps. I didn't need to create any new content. I was able to take all of my content from Unreal Tournament and just port it into Second Life. It's the same deal and I can have many obelisks in an area. Students can all come and they can compete against each other, or they can come any time and practice for the tests. So the only difference here is when you click on an incorrect one, you've got to start over. When you click on a correct one it gives you a new set of questions.
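The rule shared by the maze and the obelisks (a correct pick moves you forward, a wrong pick sends you back to the start) is simple enough to sketch. This is only an illustration of that rule in Python, not the actual script used in either environment; the function names and the answer key are hypothetical.

```python
def next_level(level: int, chosen: int, correct: int) -> int:
    """Advance one level on a correct pick; any wrong pick restarts at level 0."""
    return level + 1 if chosen == correct else 0


def levels_reached(picks, answer_key):
    """Replay a sequence of picks against an answer key and return the final level."""
    level = 0
    for chosen in picks:
        if level >= len(answer_key):
            break  # all question sets completed
        level = next_level(level, chosen, answer_key[level])
    return level


# Hypothetical key: index of the correct image (0-3) for each of three levels.
answer_key = [2, 0, 3]
# One mistake at level 1 forces the player to redo level 0 before finishing.
print(levels_reached([2, 1, 2, 0, 3], answer_key))
```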

Now there's a lot you can do with Second Life; and again, this is going to be pretty chemistry-specific. Of course molecules are important. Working with Andrew Lang, one of my collaborators, we have this thing we call a rezzer. It's an object in Second Life that enables you to do something. This particular rezzer allows you to create molecules. So, you feed it information about the molecule and it will create it in 3D. You can see, this is the size of our avatars, you can make a huge molecule. It's a way of looking at chemistry in a way that's completely impractical in real life. You could buy a model kit that would be gigantic but that's not very convenient or practical. You can do these kinds of things in Second Life.

You can do things like fly around on your molecule. Once you've made it, you can wear it and run around. This is actually on Nature Island. There's a lot of educational areas on Second Life and I'd be happy to give you links to that if you're interested.

This is a Buckyball, if you've heard of these. There is some chemistry there. The really nice thing about Second Life is not so much what you can do there, it's the people you can meet. If you're in an area that all it has is science and molecules, you're likely to meet some interesting people from around the world -- whether they be teachers, or students, you're going to meet people that have some interest in that area. If you go to other areas in Second Life that you may have heard a lot more about, then you'll find other kinds of people. So Second Life is much like the Internet. You need to find the locations where the type of content is of interest. If you hang around there, you will definitely have a good experience and meet people.

3D periodic table. Again, we did this for the American Chemical Society. You can walk around this periodic table, you can click on these atoms and it will give you information about each element. So many things you can do; again, you've got 3D here. You can use your imagination in ways that you simply can't in a traditional website.

Another thing that works extremely well in Second Life are posters. So if you have a PowerPoint, there are tools that we have. Actually, Drexel has an island called Drexel Island, and there's a store there. There's these boards that you can get for free, if any of you are interested. You basically just export your PowerPoint in the JPEG format, and then just dump it into the board. It will show your PowerPoint. If you click on this it will actually change the slides.

The really nice thing about this is, as you can see, this is me and I'm talking to two people. This is Loly's poster actually. You can either talk to them with voice, or you can type so you can chat with them. There's a little bell here with which you can summon the presenter. If the presenter's online, the text will be in green; you can IM them and you can invite them over. If they're not online it will actually go to their email account, and if they happen to be online somewhere they can quickly log into Second Life and come talk to you.

So I really like posters for Second Life. I think it's a no-brainer basically because it's easy to do, everybody gets it. There's not very much training with it. You don't need to learn to do all kinds of tricky things.

In fact, I've organized conferences in Second Life. There's the Scifoo conference that Nature and Google and O'Reilly organized last summer. I extended some of the sessions on Nature Island by having a lot of posters and people presenting. We'd have maybe 3-6 speakers. Most of these people have never been on Second Life, yet they were able to present in very quick time. And of course this is the social aspect of it. You meet a lot of people in conferences in Second Life, they're going to be attracted to the content. A lot of this content was about Open Science, so if you had any interest in that you could certainly meet people.

And you can have faculty offices. This is really completely up to you. Here I have a little area with a real picture of me, some molecules. This is actually the chemistry department. It's a little pod area. Again, a lot of these things are very simple. Any images, that's very simple to do in Second Life. I think the big problem is people getting too ambitious with it. You have an idea, and you want to do that particular thing for your class, but it's not a realistic idea.

Now there are people in Second Life who are developers who can definitely help you if you pay them. There's a lot of people who will do it for free. What you need to do is talk to somebody who is already doing it and ask them how realistic your idea is. Most of the time, all people really want to do is they want to put up a presentation or a couple of posters. In Chemistry, of course, I want to put molecules up. If you're in Bio, maybe you have other needs and so that would be something you would discuss with people. I could certainly talk with you guys about that if you had any ideas.

So, in summary, you can use blogs for creating podcasts or for storing static, sequential content. Transcripts also are a very quick and dirty thing. If you buy transcripts, you can quickly create a blog. Just dump them in there, and they will get indexed by Google pretty quickly, and you can link them back to your recordings. That's one of the things that I've done.

You can use wikis to organize content and interact with students on assignments. Again, very, very useful. There's a versioning system in the wiki where you can actually see all the past actions. You can use Second Life to stretch student/teacher imagination. Again, one of the things that people tend to discount is this networking aspect. I think that's very, very important. If your students can meet a couple of people from around the world, you never know who you're going to meet that can help your student down the road.

Use multiple channels to deliver content. This is not about using one of these tools versus another. What it is, it's about giving multiple channels to your students so that they will find the path that makes the most sense to them. Some of the students will like Second Life. Some of them won't. Some of them will like the WebCT quizzes, others won't. As much as you possibly can, try to give it in different versions. Most of these technologies are simple, free and hosted. So you really don't need anything for most of these.

For Second Life you do need a place to build, but I can talk to you about that if you want to experiment. We can probably find a little place for you. Ultimately, if you want to buy an island, then that would be done at the Institution level. I've gone through that process, so I can also advise. But certainly you don't need to buy an island to get involved with Second Life. So that's it for my talk. Let me try to log on and we'll see-- we have a few more minutes?

Open Notebook Science and Cheminformatics

Transcript of Open Notebook Science and Cheminformatics presented at Indiana University.

Jean-Claude Bradley: ...OK. So thank you very much for the invitation to speak to your Cheminformatics class this morning. I would, basically, like to show in detail how we're using Cheminformatics concepts in my lab. But in order to really show its place, I'm going to give a little bit of an overview of the big picture. In other words, what Open Notebook Science is, how it's actually done at a high level, and how that trickles down to the smallest details of SMILES, InChIs, and all of that.

So, in the very beginning, let me just introduce the concept of Open Notebook Science. You want to think of it as, basically, being at the end of a continuum here. There has recently been a lot of effort to try to make science in research and teaching--although we're talking mainly of our research here--more open. If we look at the traditional lab notebook, typically, it's an unpublished document. The only people who have access to it are typically the student and their supervisors, of course. But, when they leave the lab, all of that information is typically not readily accessible to other people. You may have other people around the world repeating their experiments when they could very well make use of that information.

Of course, a step above that is the traditional journal article. Here, this is the typical article format where you have your Introduction, you have your results, your discussions, and it's all one little story. That's great in that it helps communicate some science, but what's often missing in there is all the failed experiments and a lot of the ambiguous results that are probably more common than not in a typical organic chemistry lab. That's more open, but it's still not completely open because not everyone in the world can access it. Usually, you have to get a subscription to the journal.

So there's a new wave coming out to make journal articles open access. There, typically, the fee is paid by the author although not all of these, but those articles are available for free to anyone in the world. But again here, the format is the same as the traditional journal article, so a lot of the failed experiments and things like that are not included. What we're doing in my lab is what I call "Open Notebook Science," which is where we're trying to achieve full transparency so the actual lab notebook is a public document and it's on a wiki, actually.

Now, to show you how all this connects together, we have various vehicles in the lab. We have wikis, blogs, mailing lists, things like that. They all have their place. Usually, when I give this talk, I like to start at the blog level because that's really the public interface. That's where things get discussed in a way that most people--scientists or possibly people who are not scientists--can actually figure out a lot of what's going on. So I'm going to give you a few examples of what goes on there and I'm going to show how that's connected to our laboratory notebook.

Some of the things that are discussed on my UsefulChem blog are funding. Again, this is something that's not typically public, but I think there's a tremendous advantage to making these things public. One of the things is you can find new collaborators. You can have people try to understand what's going on in their field, what people are trying to do. I think that, overall, even though there's some hesitations about scooping--and we can discuss that later--I think, overall, it's very positive to make things as public as possible.

The other thing on my blog that I've done recently--this is just examples from the past couple of months--is supporting funding initiatives. My friend, Cameron Neylon at Southampton University, was writing a proposal to help people travel to an Open Science talk. He put a request out there for support and, with my blog, I was able to do that in a small way. Whenever we have media coverage, I like to report on that. I also highlight peer-reviewed coverage. Lab notebooks that I'll be showing you are not the traditional peer-reviewed publication.

It doesn't mean that we're not interested in that. In fact, we're very much interested in publishing our work using traditional channels. But, if you want to be able to cite these things, people don't necessarily believe that you can do that, that you can actually cite a lab notebook page. So I always like to point out specifically when people in peer-reviewed articles actually cite our work. So that's just one example of that happening in the past few months.

I like very much to announce new collaborators. Rajarshi, of course, has been a long-time collaborator, but in the past few months, we've had some recent ones. Gus Rosania, University of Michigan, talked a little bit more about what he's doing in drug transport and collaborating with us. Matthias Zeller is an expert crystallographer who's actually done crystal structures for our compounds, which Rajarshi and I were talking about a few minutes ago. So he's the guy, actually, who's responsible for that.

Again, another collaborator who is very generous with his time to be able to provide us this information. We have another collaborator, at University of California in San Francisco - Phil Rosenthal. He's the person who has been testing our compounds for anti-malarial activity.

The other thing I like to talk about are presentations, so what I'm doing here, I'll most likely blog about and link to the recording. Presentations in Second Life, so you'll see a few slides here about Second Life, which is a virtual world where people exist in the form of avatars. This would be me, for example, and these were all the people at the meeting, then we can interact this way.

We can have presentations here and these are just basically PowerPoint slides in Second Life, so that's something else that I discuss. I can also discuss Science in new media, so here's an example of a protein in Second Life. I can talk about how I use it in teaching, so here's a student of mine flying around on a camphor molecule and here's a buckyball.

These are all things that are related somehow to the work that we're doing in my lab. As you know, we're doing malaria work, so we're trying to make anti-malarial compounds. Part of the advantage of having a blog is that, once people know about it, they can ask you for support for other related initiatives. Here is a "Run for Malaria" in Philly, and it's to collect money for nets in Africa. It's a perfect example: among the people who would be following our work here, there's a good overlap with people who might be interested in participating in things like this.

Finally, I talk about more general science philosophies, so not necessarily just organic chemistry or the synthesis of anti-malarial compounds but a lot of the fundamental issues about how science gets done and the opportunities of Web 2.0 technologies to facilitate that.

OK. So those are just really quickly some examples of things that I discuss. The one thing that I didn't show in those blog posts is that a lot of them actually have links to back up some of the statements that I'm making. I'll be linking to specific experiments on our laboratory wiki. Basically, here I make an announcement about falcipain-2, which is an enzyme discovered by Phil Rosenthal that degrades hemoglobin. So it's an enzyme that belongs to the malarial parasite. We just talked about the fact that we just shipped a couple of compounds.

Now, there's a link here, it says "See Experiment 150." So this is where the beef is. Basically, there's no reason for you to believe anything that I'm saying, and you really shouldn't if you're really applying the scientific criteria. I'm going to be linking to the original data and you can make up your own mind as to whether or not what I'm saying is reasonable.

So if we click on this link, we'll end up on the Lab Notebook page, which is on a wiki, so this is an example of a Ugi reaction. We're just mixing four components together and we're getting a precipitate, which is this Ugi product.

There's no need to really go into the chemistry here, just to say that we're using these compounds, so we want to index them in some way, and I'll get into that. We're making this compound, and if we're trying to find in which experiment this compound was made, there are ways of, basically, finding that using some cheminformatics tools. But right now, I'm just going to break down this Lab Notebook page and show you how you can gain access to all of the raw data.

The first part, this is actually a pretty long page, so I'm going to take it section by section. The first part is, of course, the Objective, and we're trying to make this particular compound. I'm going to click on this on the next slide and I'll show you that it's going to link to an entry in ChemSpider, which hopefully you're familiar with. It's a pretty large database of chemicals, I think, it has almost 20 million compounds now. We've been using ChemSpider to archive our results to a large extent.

The other thing you'll notice on the Objective here is that it has links to all kinds of things including a Summary Post. The reason that I really want to make sure that there's some sort of link like this in every experiment page is, if you're Googling and you find this lab notebook page, I think it's very important for people to be able to know what the bigger picture is.

If you read this, and OK, you do understand the chemistry but you don't understand why we're doing it, you would click on the Summary post. This will take you, typically, to a blog post that explains, in a lot more detail and at a much higher level, first of all, what anti-malarial compounds we are trying to make. It will also explain the reason why we're attacking falcipain-2 as the target enzyme and the docking results that Rajarshi ran for us. All of that, basically, is traceable and linkable from here.

So I'm now going to click on this Ugi adduct link to show you what the ChemSpider entry looks like. So ChemSpider has a bunch of pages and here they have the molecule. There's a little bit of a rendering problem here; you'll notice this bond, it looks kind of weird. That's actually been resolved for the most part. ChemSpider has been around for about a year and they're in constant development. If you're working with new technology, sometimes these things will happen.

But the people there are extremely responsive. The advantage is that a lot of things that we want to do, they can actually implement for us. Whereas, if they were a more traditional kind of archive, it might be more difficult for them to customize what we want to do. But this service actually provides a lot of useful stuff. They provide the SMILES, they provide the InChI, and they provide the InChIKey automatically.

So I'll be using these for various purposes. The SMILES, of course, is pretty handy for searching in online databases. If a database can be searched by structure in any way, SMILES is usually going to be included. The InChI is starting more and more to be included in online databases, but it's not always the case. The advantage of the InChI is that there should be only one unique InChI per molecule. Whereas with the SMILES, oftentimes, there are multiple SMILES for the same molecule, so that's a big advantage of using InChI.

One of the disadvantages of InChI is that for large molecules--and this would be considered, actually, a pretty large molecule for InChI--the InChI is so long that it doesn't properly get indexed by search engines like Google. So we've started to use the InChIKey a lot more, which is basically just a lookup table from the InChIs. The advantage of the InChIKey is it's very short, and it's just a bunch of letters that should not recur accidentally very easily. So if you actually do a Google search for the InChIKey, you're pretty likely to find what you're looking for and you're not going to get a lot of junk in addition.

The InChIKey here has two components, and that's also a pretty useful thing. One of the components, the first part here, actually tells you the connectivity of the atoms. So if you're not interested--like here we have a trans double bond, so if I didn't care about cis or trans--I could just search for the first part of the InChIKey and it would pull up all cis and trans, all the various stereoisomers. But if I wanted to specify this particular isomer, there is this additional information in the InChIKey that does that specification. So this is handy as well, this way of making InChIKeys.
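To make the idea of searching on the first part of the key concrete, here is a small Python sketch. It assumes only that the connectivity part is whatever comes before the first hyphen. The keys below are made-up placeholders in the general shape of an InChIKey, not keys of real compounds, and the function names are hypothetical.

```python
from collections import defaultdict
from urllib.parse import quote_plus


def connectivity_block(inchikey: str) -> str:
    """Return the part of the key before the first hyphen (atom connectivity)."""
    return inchikey.split("-", 1)[0]


def group_by_connectivity(inchikeys):
    """Group keys so that entries sharing a connectivity block end up together."""
    groups = defaultdict(list)
    for key in inchikeys:
        groups[connectivity_block(key)].append(key)
    return dict(groups)


def search_url(term: str) -> str:
    """Build a web search URL for a full key or for just its first block."""
    return "https://www.google.com/search?q=" + quote_plus(term)


# Placeholder keys: the first two share connectivity and differ only afterwards,
# the way a cis/trans pair would.
keys = [
    "AAAAAAAAAAAAAA-BBBBBBBBBB-N",
    "AAAAAAAAAAAAAA-CCCCCCCCCC-N",
    "DDDDDDDDDDDDDD-BBBBBBBBBB-N",
]
print(group_by_connectivity(keys))
print(search_url(connectivity_block(keys[0])))
```

Searching on the full key targets one specific isomer; searching on the first block alone is the looser query that ignores the cis/trans distinction.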

OK. So the other thing that we can link to is the experimental plan. Before the students start the experiment, they typically, are going to follow some sort of plan. This is something that you can look up or anybody can look up. We also want to link to the docking procedure. You may not be a synthetic organic chemist; you may, actually, be more from the docking side.

If you want to see exactly what was done--Rajarshi, in this case, actually, did this run for us--and so here's the library, if you wanted to see the list of all the compounds that he used to dock, you can see that. Here's a link to falcipain-2 and it explains more about this enzyme.

The procedure--this basically Rajarshi wrote--and he's explaining how he used the PDB file and he's explaining the two docking sites on that particular enzyme that he thought would be a good starting point and here are the actual results. So these are the list of compounds that we're trying to make. If you want to see what they look like, you click on one of these links and you'll see that these are just tables of SMILES.

Once again, SMILES is a pretty convenient format to store lists of molecules and that's what we've done in this particular case. Someone could actually, if they were interested in all of the first 1500 hits; they could easily come here and copy and paste.

We also have a procedure section. This is supposed to look more like the kind of thing that you'll find in a traditional journal article. These are written in such a way to make it easy for us to send out our papers. We just have to basically copy and paste these sections into the "experimental" section. In order to actually verify the observations and conclusions of the experiment, we provide all of the raw data for the spectra. In this particular case, again, we're looking at the same Experiment 150.

A compound was isolated, and the proton NMR and the carbon NMR were uploaded. They're uploaded using JCAMP-DX format. This is a pretty convenient format for all kinds of spectroscopy. This is NMR; you could have carbon NMR, proton NMR, IR, Mass Spec; all of these different experiments, equipment can usually save the results as JCAMP-DX format. The advantage in that is that there are free open-source viewers such as JSpecView that will enable you to view the data in an interactive format.
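Part of what makes JCAMP-DX convenient is that it is plain text: each piece of metadata sits on a line of the form ##LABEL=value, followed by the data table. The following is a minimal Python sketch that reads just those header records. The sample file content is invented for illustration, and a real viewer such as JSpecView does far more (for example, decoding the data table itself, which this sketch does not attempt).

```python
def read_jcamp_header(text: str) -> dict:
    """Collect the ##LABEL=value records that precede the data table."""
    header = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line.startswith("##"):
            continue  # skip blank lines and numeric data rows
        label, _, value = line[2:].partition("=")
        label = label.strip().upper()
        if label == "XYDATA":
            break  # the numeric table starts here
        header[label] = value.strip()
    return header


# Invented example content, not taken from any real spectrum.
sample = """##TITLE=hypothetical proton NMR sample
##JCAMP-DX=4.24
##DATA TYPE=NMR SPECTRUM
##XUNITS=HZ
##YUNITS=ARBITRARY UNITS
##NPOINTS=4
##XYDATA=(X++(Y..Y))
0 10 12 11 9
##END=
"""
header = read_jcamp_header(sample)
print(header["TITLE"], header["XUNITS"], header["NPOINTS"])
```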

So if you were to come onto the website and click this proton NMR link, it would pop this up. If you wanted to zoom into any region, all you have to do is drag with your mouse across and it will expand all of these little peaks. If you're familiar with NMR, you'll know that that's actually really critical. That's the only way to get J-Constants, for example. Also, it can reveal a lot about the phasing. On this unexpanded view I can see peaks, but it is unclear what the quality of the peaks is.

When I expand them, I can see that this is a triplet, but there's some phasing issue going on. It's a way of looking for the details of impurities and things like that. In the supplementary section of most journals, they don't typically give you all of the expansions for the NMRs. They will give you the expansions that the researcher wants you to see, but sometimes that can be a little misleading. That's why I'm a big fan of using JSpecView. This also does not require the person viewing the information to download any software.

This is just using Java, so they're using a common browser and they can just click on it and view the information right away. I'm a big fan of that. Finally, when you get to the conclusion of the experiment, in this case that the Ugi product was obtained in 59% yield, you really don't have to trust that. You can drill down to any part of this experiment and you can see if you agree with that conclusion, based on all the evidence provided.

That's really what Open Notebook Science is all about. It's about making all of your results public, so that if you're making a statement you can truly back it up with real support. OK, the last section on that page is typically a tags section. Here, we're listing all of the molecules that were used in this experiment. There are three or four formats that we're using right now. One is the common name. Of course, the problem with common names is that there's more than one name for a molecule, so the common name is just a handy thing for us to keep track of what the entry is.

Really, what we're interested in is to put the InChI, and to put the InChIKey, for each one of the molecules that was used.

These are actually linking to Google, so if you were to click on one of these links, it would give you a Google search for this InChIKey. It would show, most likely, mainly the results from UsefulChem, but it would also show other people who've used this compound and have indexed it with InChIKey. So right now I think this is the best way for tagging molecules. Now we're going to be looking at comparing experiments. I just showed you one experiment there; we've actually done a whole bunch of experiments.

We'd like to have convenient ways of comparing them. There's a few ways of doing that, one is with a simple table. We're using Google Docs, again because it's something that's easy. It's hosted, we can actually make it public very easily. Remember, the point of this is to make things as public, as quickly as possible. I'm a big fan of Google Docs. You can see here that we are keeping track of things, not only with the common name, but also using SMILES.

We've also used InChIKeys in this case as well. InChI not so much, again, because a lot of the InChIs for the larger molecules are just unmanageably long. I will put InChIs for small molecules but I tend to stay away from it for the big ones. The most important section of every experiment is of course the log. If you don't have the log or if the log is incomplete, you really can't do anything with that experiment. We have a log, which is just basically the student recording what they did and what they observed at different times.

You can actually construct all of the results from this. You can construct your discussion, and ultimately, of course, your conclusion. But if this is missing, you really don't have much proof for what you did. I do consider it the most important section. The problem with the log written in this way is that this is written in freeform. I know that Rajarshi and I have talked a lot about trying to automate things a little bit more. If you have a log in a freehand format, that makes it very difficult.

If a machine were to look at this, it might be able to pull a few things out. It would know that benzylamine was used, but it wouldn't know much more precisely what was actually done. One of the things that we're doing now is converting all of these logs that are written in a freehand form to a machine-readable format. Here's one example of how we're doing that. I'm taking that log and I am now breaking it down into a series of steps in a workflow.

These words here--add, weigh, vortex, take picture--are all terms that we agreed that we would use to specify certain kinds of actions. If I use the term "vortex," every time I vortex I'm going to be using this exact way of representing that. These are represented in steps. This actually could be read by a machine fairly easily. When I say "Add Compound," I specify it with a common name. This is mainly for human use so we can tell roughly what's going on.

I'm using the InChIKey for the machine to read. The InChIKey, again, I prefer this to the InChI because it's always going to be the same length no matter how large the molecule. It's a really convenient way of keeping things concise. This is one of the ways in which we are trying to automate things, or make them available for people who are interested in automating it.
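One way to picture such a machine-readable log is as a list of steps, each with an action drawn from an agreed vocabulary, plus a common name for humans and an InChIKey for machines. The Python sketch below is only an assumption about how that could be modelled; it is not the format actually used on the wiki. The class, the field names, and the placeholder key are hypothetical, and the vocabulary shown is just a subset of the terms mentioned above.

```python
from dataclasses import dataclass
from typing import Optional

# Subset of the agreed action terms; a real vocabulary would be longer.
ALLOWED_ACTIONS = {"add", "vortex", "take picture"}


@dataclass(frozen=True)
class Step:
    action: str
    common_name: Optional[str] = None   # for human readers
    inchikey: Optional[str] = None      # for machines

    def __post_init__(self):
        # Reject free-form wording so every log uses the same terms.
        if self.action not in ALLOWED_ACTIONS:
            raise ValueError(f"unknown action: {self.action!r}")
        if self.action == "add" and not self.inchikey:
            raise ValueError("an 'add' step needs an InChIKey")


def compounds_used(steps):
    """List the InChIKeys of everything added during the workflow."""
    return [s.inchikey for s in steps if s.action == "add"]


# Placeholder key, not the real InChIKey of benzylamine.
workflow = [
    Step("add", common_name="benzylamine", inchikey="XXXXXXXXXXXXXX-XXXXXXXXXX-X"),
    Step("vortex"),
    Step("take picture"),
]
print(compounds_used(workflow))
```

Because the action terms are validated, a program reading many such logs can count or compare steps without having to interpret free-form prose.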

So we can take these workflows as well and we can represent them in some more tables. Here, just to clarify the difference between the first table I showed you and this one: on the first table, I was looking at entire experiments. Here, I'm actually breaking down each experiment into each individual result that was obtained.

In Experiment 150, for example, I took pictures multiple times and each one of these is a self-contained result. Let's say, in the experiment, I drop the flask on the fifth day. That doesn't mean that everything that I did on days 1, 2, 3, and 4 is not helpful.

This certainly can be helpful but not if I think of them in terms of an experiment. If I drop the flask on the fifth day, the typical thing to do would be to just abort the experiment and move on. But if we're breaking things down into each individual result, we can actually use that information to plan further experiments. The key here is to represent it in such a way that it's systematic and that other people can use it easily.
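To illustrate the idea of storing results individually, here is a small Python sketch. The `Result` record, the experiment label, and the observations are all hypothetical; the point is only that the results logged before a failure remain retrievable even when the experiment as a whole is marked as aborted.

```python
from dataclasses import dataclass


@dataclass
class Result:
    experiment: str
    day: int
    observation: str


# Hypothetical entries: one self-contained result per day, days 1 to 4.
results = [Result("EXP150", d, f"picture taken on day {d}") for d in range(1, 5)]

# Experiment-level bookkeeping would only record that the run was aborted.
experiment_status = {"EXP150": "aborted on day 5"}


def usable_results(all_results, experiment):
    """Return every result logged for an experiment, however the run ended."""
    return [r for r in all_results if r.experiment == experiment]


earlier_days = [r.day for r in usable_results(results, "EXP150")]
```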

So there are a couple of different ways of organizing this. There is a table here on one of the wiki pages that is just a table of contents, so the list of all experiments. If you, as a human being, are looking for information and you know the experiment number, this is probably the easiest way to do it. You can also see the person who did the experiment. You can see a brief title, but in order to really see what's going on, you'd have to click in. Still, it is probably the most common way of starting.

Now, I would like to talk briefly about why it is that we're using a wiki in Open Notebook Science. One of the things is that with a wiki, it's very easy to tell what's going on in the lab. So if I click on the recent changes button here, it will tell me who did something in the past few minutes, hours, and days. It'll tell me exactly when and it'll tell me which page was modified. So if I'm interested in either what Emily is doing or if I'm interested in Experiment 158, I would then click on this and then see exactly what the addition was.

So if I do click on one of the experiments, there's a history button on every page of the wiki and I can see every single modification of that page over time. Now, we're using Wikispaces to do this, which is a free hosted service. The big advantage of that is that these date-time stamps are third-party generated. So if I were running this on my own server, someone could argue, "How do I know that the time is correct?" Since we're using a third-party time stamp, I think it's pretty objective that if I look at any one of these versions, I'm able to prove that I knew what I knew at this particular time.

So even though things might change, anyone can go back in time and see what exactly we knew and what we talked about. If there are mistakes made, we can see what those mistakes were and how long it took to correct them, all kinds of things like that.

Comparing two pages on a wiki is very simple, you just compare and anything that's new will show up in green and anything that's deleted will show up in red. So if you're making comments and the students are responding, this is actually a really convenient way of doing that and keeping track of it.
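The same kind of comparison can be sketched in a few lines of Python with the standard `difflib` module. This is not how the wiki itself computes its comparison; it is just an illustration of a line-level diff, and the two page revisions are invented for the example.

```python
import difflib


def compare_revisions(old, new):
    """Return (added, deleted) lines between two revisions of a page."""
    added, deleted = [], []
    for line in difflib.ndiff(old.splitlines(), new.splitlines()):
        if line.startswith("+ "):
            added.append(line[2:])    # what a wiki would show in green
        elif line.startswith("- "):
            deleted.append(line[2:])  # what a wiki would show in red
    return added, deleted


# Invented revisions of a notebook page.
old_page = "Added benzylamine.\nVortexed for 10 s."
new_page = "Added benzylamine.\nVortexed for 30 s.\nTook picture."

added, deleted = compare_revisions(old_page, new_page)
```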

On every blog and wiki page I also use this little service called Site Meter. Again, it's another free hosted service that tells you how people are finding the various pages. So this is something that I check pretty regularly because it tells me how the information that's on our wiki and blogs is actually being used. This is very interesting because you see people, for example, searching for the NMR of butylamine. That's the kind of search that is a little bit harder to do on a traditional journal article.

Whereas if I'm looking for oxidation of catechol or if I'm looking for imine chemistry, these are things where there's a lot of discussion about troubleshooting. There's discussion about things that don't typically show up in a traditional article. This gives me good feedback that we're actually accomplishing what we set out to do in terms of making the information usable.

Another nice thing about having Open Notebook Science is that you can actually tell the story of the failure, something very difficult to do in a traditional journal. We did manage to make this compound, for example, [inaudible] and we ran into a lot of problems. It actually took a long time to do this. Although you can discuss a little bit of this in a traditional article, here, I can go into great detail and talk about exactly why it is that we failed, why it is that it took so long. I can link to the original experiments. Maybe we save somebody some time if they're trying to do a similar thing.

So we use additional vehicles besides the wiki and the blog: we use a mailing list. I find mailing lists to be very useful for other groups. The wiki is good for my group and I certainly welcome people to contribute to it. But a lot of times, it's easier for people that are at different institutions to just hit Reply in their mail. So we use a UsefulChem mailing list for that purpose. We're working out a lot of little details that are not important enough to make it into the blog.

So if I just step back and look at the very big picture here, my group at Drexel is a synthetic organic chemistry group. In terms of the information flow, we have Rajarshi doing the docking. We also had, although not recently, Tsu-Soo Tan from Nanyang Institute, who's also done some docking work for us.

What we'll do is take that information from Rajarshi and we will then decide what compounds to make. Once we've made the compounds, we then ship them out to either the Phil Rosenthal group if we're doing an anti-malarial test, or we'll shoot them out to NCI, where they've actually done anti-tumor testing on our products.

So once we get feedback from that, we can go back to Rajarshi and we can say, this is working, this is not working, and then we can alter the model. It's nice to have so many great people that are willing to donate their time and expertise to actually do something constructive, to do a whole loop here. That's what I call "Closing the Science Loop." Once we get information, we can then start over again and get more information from the docking group and just make more compounds until we make better and better agents.

A couple of other people have started to collaborate with us. I talked earlier about Gus Rosania. He's actually building a red blood cell model and a malarial parasite model, so that we can try to simulate the transport of the various drugs that we're making to see if we can make them better by changing their transport properties.

Other people are very involved with Open Notebook Science, such as Cameron Neylon from the University of Southampton. He's also recording his experiments in great detail, but he's not using a wiki. He's actually using blogging software that is modified to keep track of versions.

There's not one way to do this, I'm just showing that you can use wikis and blogs to do it this way. There are other people and they have other reasons for doing things in a different way. So it's nice to see different examples like that.

In the next few minutes here, I'd just like to talk in some more detail about the Cheminformatics side of this. What we want to do, by exposing the information about compounds (InChIs, SMILES, whatever), is to be able to have machines process it as much as possible.

Here's something that we've done on our server, where we published information about each molecule in the format of SMILES, using a traditional blog. One post would correspond to one molecule on the blog. We're no longer using that because we have too many molecules now, but at first we were doing this. There are a few advantages to this; we can actually read the feed from that blog, and we can then calculate the InChI and we can calculate the molecular weight.

We can look up suppliers; we can make all of that available as separate web pages. We can also show the appearance of the molecule in 3D using Jmol. All of this stuff here is generated automatically. It only requires people to dump the SMILES in a blog post. OK, so that's one example. The other thing that we can do is take that same feed and convert it to a CMLRSS feed. We're not doing a lot of this, but we did it just to demonstrate that it was possible.
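The feed-reading step can be sketched with Python's standard XML parser. The feed layout below (post title as the common name, post body holding only the SMILES) is an assumption for illustration, not the actual layout of the molecules blog. Computing the InChI and the molecular weight from each SMILES would then be handed to a cheminformatics toolkit, which is not shown here.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed: one <item> per molecule, with the SMILES in <description>.
FEED = """<rss version="2.0"><channel>
<title>Molecules</title>
<item><title>benzylamine</title><description>NCc1ccccc1</description></item>
<item><title>acetophenone</title><description>CC(=O)c1ccccc1</description></item>
</channel></rss>"""


def molecules_from_feed(xml_text):
    """Map each post title to the SMILES string found in its description."""
    root = ET.fromstring(xml_text)
    return {
        item.findtext("title").strip(): item.findtext("description").strip()
        for item in root.iter("item")
    }


molecules = molecules_from_feed(FEED)
```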

We can take that CMLRSS feed and we can read it in readers like Bioclipse. Very briefly, Bioclipse enables you to read CMLRSS and then each entry here would be a different molecule; you can associate spectra and things like that. If anyone has more interest in that, feel free to contact me. There are people that are still working on this. This hasn't made it into our routine workflow, but it is something that we have experimented with.

Now I talked a little bit earlier about looking at spectral information using JSpecView and the JCAMP format. If you remember, the reason that this is really nice is you can expand any peak. This is showing how it looks in the browser, and that I can expand the peak. Now if we want machines to be able to make use of this JCAMP format, we can actually use Excel VBA. If I have Excel VBA reading the information from the JCAMP files, I can do something pretty interesting.

I can specify different regions in Excel as to the peak locations, and I can attribute each one of those peak locations to what I think is the corresponding chemical group. If I run that, it will read each individual spectrum and it will take out all the XY data and all the metadata. It will automatically give me a reaction profile that I can determine kinetics from.

This is neat because all I'm starting with here is an experiment that I'm monitoring using NMR. I'm dumping those NMRs in JCAMP format in a folder. I specify the start time of the experiment, I click on the Excel VBA and it spits out this reaction profile.
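The Excel VBA macro itself is not shown in the talk, so the following is only a rough Python sketch of the same idea under stated assumptions: each spectrum is an uncompressed JCAMP-DX file with `##FIRSTX`, `##LASTX`, `##NPOINTS` and an `##XYDATA=(X++(Y..Y))` block, the elapsed time of each spectrum is already known, and the two peak regions are hypothetical. Real spectra often use compressed data forms, which this sketch does not handle.

```python
def parse_jcamp(text):
    """Parse an uncompressed JCAMP-DX string into (metadata, xs, ys)."""
    meta, ys, in_data = {}, [], False
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("##"):
            label, _, value = line[2:].partition("=")
            label = label.strip().upper()
            if label == "END":
                break
            in_data = label == "XYDATA"
            if not in_data:
                meta[label] = value.strip()
        elif in_data:
            # Each data line is one X value followed by the Y values.
            ys.extend(float(v) for v in line.split()[1:])
    n = int(meta["NPOINTS"])
    first_x, last_x = float(meta["FIRSTX"]), float(meta["LASTX"])
    step = (last_x - first_x) / (n - 1)
    y_factor = float(meta.get("YFACTOR", "1"))
    xs = [first_x + i * step for i in range(n)]
    return meta, xs, [y * y_factor for y in ys]


def region_area(xs, ys, low, high):
    """Approximate the area under the points with low <= x <= high."""
    step = abs(xs[1] - xs[0])
    return sum(y for x, y in zip(xs, ys) if low <= x <= high) * step


def reaction_profile(spectra, start_region, product_region):
    """Return (minutes, product fraction) for each (minutes, jcamp_text)."""
    profile = []
    for minutes, text in spectra:
        _, xs, ys = parse_jcamp(text)
        start = region_area(xs, ys, *start_region)
        product = region_area(xs, ys, *product_region)
        profile.append((minutes, product / (start + product)))
    return profile


def toy_spectrum(start_height, product_height):
    """Build a ten-point toy spectrum with one peak per region."""
    return f"""##TITLE=toy spectrum
##JCAMP-DX=4.24
##FIRSTX=0
##LASTX=9
##NPOINTS=10
##XYDATA=(X++(Y..Y))
0 0 0 {start_height} 0 0
5 0 0 {product_height} 0 0
##END=
"""


# Two invented time points: mostly starting material, then mostly product.
spectra = [(0, toy_spectrum(5, 1)), (60, toy_spectrum(1, 5))]
profile = reaction_profile(spectra, start_region=(1, 3), product_region=(6, 8))
```

The profile is a list of time and product-fraction pairs, which is the kind of table a kinetics fit would start from.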

We don't do reaction profiling much lately, but about a year ago we did a lot of this. This is just another example of how you can leverage automation if you have the raw data in a usable format. One last thing here: what's the next step with what we're doing?

I showed you a little bit about Second Life, how we can actually create molecules. We have a rezzer in Second Life that Andrew Lang built for us. All that means is it's a little bit of software where instead of figuring out how to connect all these bonds, you dump either the SMILES or the InChI or the InChIKey in the chat box.

It will create a 3D version of the molecule in Second Life. This is using some of the scripts that Rajarshi wrote. It's hitting ChemSpider web services for converting InChIKeys into InChIs. It's using all kinds of different things, and it is very conveniently giving you the molecule. One of the advantages of these SMILES and InChIs and all this is that if you can minimize how much the user has to worry about transforming things, you can get people who are not familiar with Cheminformatics to do some pretty powerful things.
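The talk does not describe how the rezzer decides which kind of identifier it has been given, so the following Python is only a guess at one simple heuristic. It assumes InChI strings start with `InChI=` and that InChIKeys follow the 27-character standard layout of 14, 10, and 1 uppercase letters separated by hyphens; anything else is treated as SMILES. An InChIKey would still need a lookup service to be turned back into a structure, which is not attempted here.

```python
import re

# Assumed layout of a standard InChIKey: 14-10-1 uppercase letters.
INCHIKEY_RE = re.compile(r"^[A-Z]{14}-[A-Z]{10}-[A-Z]$")


def classify_identifier(text):
    """Guess whether a chat message holds an InChI, an InChIKey, or SMILES."""
    text = text.strip()
    if text.startswith("InChI="):
        return "inchi"
    if INCHIKEY_RE.match(text):
        return "inchikey"
    # Fallback: anything else is passed on as a SMILES string.
    return "smiles"


kinds = [
    classify_identifier("InChI=1S/CH4/h1H4"),            # methane
    classify_identifier("AAAAAAAAAAAAAA-BBBBBBBBBB-N"),  # placeholder key
    classify_identifier("CC(=O)c1ccccc1"),               # acetophenone
]
```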

This is an example where I have my students do assignments in Second Life. All they have to do is figure out how to grab that SMILES or InChI and they can do this stuff. In the background here, this particular student assignment is looking at acetophenone. He is trying to explain how the spectrum supports the assignment of the molecule. Now, it's a little fuzzy here in the back, but this is basically a screenshot from JSpecView. What we're currently working on with Andy is making this interactive.

We want to be able to read JCAMP files directly into Second Life, and to have an interactive way to do expansions. This is something that I think is going to be really neat: I can meet up with students, and we can talk about the molecule, which is here in 3D. We can interact with the spectrum and we can try to figure out whether or not all of the different regions of the spectrum support the assignment. In order to do that, we need something better than this, which is just an image. We need to interact with the spectrum.

The bottom line with all of this is that these Cheminformatics tools that we've been talking about are great for communicating between human beings, yes, but I think ultimately the real power is going to be in going from human-human interaction to human-machine interaction, and eventually to machine-machine interaction over the free web, to design experiments, execute experiments and analyze them.

A big motivation for us, at least, is to start to use these Cheminformatics tools in a way that makes it easy for people who do like coding to write programs that will analyze what we've done in the lab, without having to interact with human beings to figure that out. That's the much bigger picture, and that's where we're headed. That's all I have for today.