Tuesday, November 25, 2008
[2670] Last reading & muddy point
So, this "keeping my [assigned] notes in my real blog (instead of starting a new blog just for that purpose)" thing has been an experiment, and it's not going to happen again. Should I have another instructor who makes the same kind of assignment--keep reading notes and questions about the course in a blog--I'll probably make a new blog just for that. Probably. I mean, it's been good to have my notes up and to be able to see and comment on others' notes; it's a good idea, a useful thing, and something more professors should do. I just kind of think I've bored the few regular readers I have with these commentaries. Separating it out would have been the right way to go.
That said, here are some thoughts on the readings for week 13. I'm glad to have a term for something that's come up so much in discussions this semester: "patent thickets--the fear that some advance will tread on pre-existing patents, of which the innovator may not even be aware." (Joseph E. Stiglitz, Intellectual Property Rights and Wrongs)
Stiglitz made several good points, including that open source shows IP protection is not necessarily a prerequisite for the creation of valuable products. And his point about giving someone a monopoly, even temporarily, leading to a lack of innovation is clearly worth examining. I didn't know about the Wright brothers and Curtis brothers and their forced patent pool, but I'm probably going to bring that up whenever I want to fight about copyright, in the future.
Moving on to Clifford Lynch's "Where Do We Go From Here? The Next Decade for Digital Libraries," I don't know about this idea of the future of libraries being "too important to be left to librarians." That's nonsense. Librarians are precisely the right people to handle this--if you can get ones who are in it for the public good over their own job security, anyway.
I'm interested that he thinks we're out of interesting research topics, with respect to digital libraries. Really?
Muddiest Point: I don't have one this week.
Labels: 2670
Friday, November 21, 2008
[2670] Muddiest Point & Reading Commentary
Muddy Point: You didn't seem to believe that having a single, unified interface to the digital libraries of the world would be a good thing. I'm curious about that and wonder if you could talk more about why not.
Reading: I really just skimmed briefly this week, I'll be honest. Which is sad: security has always been something that's interested me. And the problem of access versus ownership is extremely interesting--it's something I feel pretty strongly about. Economics, on the other hand, bores me to tears.
Labels: 2670
Saturday, November 15, 2008
Stopping the OCLC Power Grab
I found out about this by way of librarian.net and want to pass it along to anyone who might be interested.
By way of explanation: OCLC, the not-for-profit that provides library services around the world, has gone too far. Originally, it was a library collaborative -- one library could catalog a book, upload it to OCLC, and then other libraries could save time by reusing the catalog information. But as the price of such technology has fallen, its prices have risen. It charges membership fees, record retrieval fees, user support fees, and fees for all sorts of additional services. But now it wants to set the terms of use for every library record ever retrieved through OCLC, so that it can maintain its monopoly in the field. In a very real sense, they're trying to steal our libraries. We have to make them stop -- please join me in signing the petition "Stop the OCLC powergrab!" You can do so right now at http://watchdog.net/c/stop-oclc
For more information, see this wiki page: OCLC Policy Change.
Labels: 2670, libraries, politics, technology
Wednesday, November 12, 2008
[2670] Week 11 Reading & Muddiest Point
Arms is at it again. (It seems like the Digital Library World has between three and five really prolific authors and then a bunch of people who wrote one or two articles.) For the most part, his article rang the "Duh" bell: of course users would prefer to have one interface and see "searching" as a thing they should be able to do from one place, namely their desks. Even I figured this out, and I'm just starting out studying this stuff. It bothers me that individual publishers put "branding" above ease of use, in terms of database subscriptions, and if digital library creators are doing the same thing, then shame on them. OAI-PMH/Z39.50 (yes, I know they're different) is imperfect, and we all clearly must use different metadata standards (I'm only half sarcastic, I guess), but we still need to think a lot more about the larger user community and a lot less about making things "our own."
One more point about Arms, specifically. He says "Google Scholar shows somewhat the same myopic viewpoint." What now? I'm not feigning ignorance, here; I'm honestly unclear what he's talking about.
As far as the topic of this week's readings goes, generally, I know that communities are a big part of the definition of Digital Libraries. I think maybe that gets expanded too far, though. I think there needs to be a community interested in a particular subject, in order to make the work of making the digital library worthwhile, but I also think that you need to keep an eye on the grander picture. Once you've justified building the library, and once you've designed it so that the people whose use justifies its existence can, you know, use it, you then need to make it available to the broader community of potential users. Your individual community might dry up, but the library is already there and potentially usable by millions of people. So do it right.
Muddiest Point: I, too, would support the use of some class time to teach us how to use Greenstone. I figured out how to install it on my computer, but its interface isn't exactly intuitive.
Labels: 2670
Monday, November 10, 2008
I still hate Greenstone, but I got it to run
For folks in 2670, I hope this helps. For folks outside of 2670, I recommend against using Greenstone Digital Library software, but if you find that you have to, I hope this helps you install it, at least for local use.
If I find any good tricks for using Greenstone, after it's installed, I'll share those, as well. I hope my classmates will do the same. :)
Assuming the moon's alignment with Venus was not what did it, here's how I got Greenstone to run:
1) I installed VMware Fusion on my Mac. (This is unnecessary if you already have Windows XP Professional with Service Pack 2. Skip to step 3 in that case.)
2) I installed Windows XP Professional with Service Pack 2. No, XP x64 won't do it; I tried.
(Steps 1 and 2 were necessary because, to be blunt, I refuse to put this buggy software on the native operating system of my only computer. If something goes horribly wrong in my VM, I can just blow it away, without losing important data.) Please note: I did the rest of these steps before installing anything else on my Windows VM, including Firefox or Office or anything; I can't promise that this will work with anything other than IE 6, or in any order other than the one I give below; given how many times I went through failed installs, I've developed a little bit of irrational superstition about the order of operations...
3) I downloaded and installed Java Virtual Machine on my computer.
4) I downloaded and installed Java JDK, specifically JDK 6 Update 10 with Java EE. There's some weirdness with Java's download manager software, but don't worry overmuch. Just do what it tells you to do.
5) I downloaded and installed ActivePerl. I did what it told me to do and was rewarded with cute lizard icons and a properly-defined Perl PATH.
6) I downloaded and installed ImageMagick, specifically ImageMagick-6.4.5-5-Q16-windows-dll.exe. Choose only the default options; don't add more. Seriously.
7) I installed Greenstone, Windows Distribution (Latest). I went with all the defaults, including "Local" rather than "Web."
8) Here's where it gets a little tricky. Perl correctly sets its own path to what Greenstone Librarian Interface expects, but Java does not, for some reason. (I guess first try running Greenstone Librarian Interface. If it works, you don't have to do this step. It didn't work for me, so here's what I did to fix it.)
Go to C:\Program Files\Greenstone\gli, and right-click on gli.bat (it is an icon that looks like a window with a cog in it). Choose Edit. Scroll down to the three lines that look like this:
:findJava
:: ---- Check Java Exists ----
set JAVAPATH=
Now, go find java.exe. On my computer, it's in C:\Program Files\Java\jre6\bin. Paste that path in (yours, not mine, though they are probably the same); on my machine, the above now says
:findJava
:: ---- Check Java Exists ----
set JAVAPATH=C:\Program Files\Java\jre6\bin
Close the file, and click "Yes" when it asks if you want to save. Now Greenstone should run just fine. Ideally.
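For anyone who would rather not hand-edit the batch file, here's a minimal sketch in Python of the same step 8 edit. It assumes only what I saw in my own gli.bat: a single empty "set JAVAPATH=" line. The function name set_java_path is made up for illustration, and the path is the one from my machine, so swap in your own.

```python
import re

# Matches a line that is exactly "set JAVAPATH=" with nothing after the "=".
# [ \t]* is used instead of \s* so the match never swallows a newline.
EMPTY_JAVAPATH = re.compile(r"^([ \t]*set JAVAPATH=)[ \t]*$",
                            re.MULTILINE | re.IGNORECASE)


def set_java_path(bat_text, java_dir):
    """Return bat_text with the first empty 'set JAVAPATH=' line filled in.

    Lines that already have a value are left alone, so running this twice
    does not change the result.
    """
    # A function is used as the replacement so backslashes in the Windows
    # path are inserted literally rather than read as regex escapes.
    return EMPTY_JAVAPATH.sub(lambda m: m.group(1) + java_dir,
                              bat_text, count=1)


# The three lines from gli.bat quoted above, as a sample input.
SAMPLE = ":findJava\n:: ---- Check Java Exists ----\nset JAVAPATH=\n"

if __name__ == "__main__":
    print(set_java_path(SAMPLE, r"C:\Program Files\Java\jre6\bin"))
```

To apply it for real, you would read gli.bat, pass its text through the function, and write the result back. I'd keep a backup copy of the file first.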
Labels: 2670, classes, technology
Tuesday, November 4, 2008
[2670] Week 10 Reading & Muddiest Point
While Kling's paper brings up some good points, its obvious datedness (seriously, Mosaic? information superhighway?) gets distracting. I think it does a better job of showing where we were than where we ought to be going--unless Dr. He's point in giving us this to read is to say "we really haven't made much progress." That would be unfortunate.
Saracevic raises the point that not much is being done, as far as evaluation of digital libraries goes. Or, more to the point, a lot of work is being done on how digital libraries might be evaluated, but the few people who are (or were, as of 2004) doing the evaluation are not really utilizing that work. The field of digital library evaluation sounds kind of scattered and hapless.
As far as the other articles/chapters (which I can't link, either because the link to what we were supposed to read is broken, or because the resources are on Courseweb only), they seem to discuss usability in terms I'm more familiar with (and, I'll be honest, more interested in). They point to websites and operating systems and say "See, this is well-done," and "See, this isn't." As a particularly grumpy web user, I appreciate any effort to improve usability in websites.
Muddiest Point: I didn't have one from last lecture, which I think I've already said. I do want to check and make sure I heard Dr. He right: we get midterm grades in two weeks? (Not trying to rush anyone, just to make sure my hearing's good. :)) Thanks!
Labels: 2670
Thursday, October 23, 2008
[2670] Muddiest Point
I have no muddiest point.
Is anybody dying to meet up and discuss digital library-y goodness, the day before our midterm?
Labels: 2670
Thursday, October 16, 2008
Coral waxes philosophical on Week 8 Readings for 2670 (and Muddiest Point)
I loved the following quote (from our Week 8 Readings):
Google has taught us, quite powerfully, that the user just wants a search box. Arguments as to whether or not this is "best" for the user are moot—it doesn't matter if it's best if nobody uses it. Moreover, as both Google and Amazon have demonstrated, users have a funny way of determining for themselves what is best for them. --Todd Miller
Right on! I mean, I love highbrow, ivory tower discussion as much as the next girl, but what it really needs to come down to is, "How can we engage the user? What will they use?" And, not to get too far off topic, but this is an issue that's really been on my mind a lot lately. You see, before I started seriously considering librarianship as a profession--and admitting this out here in the open is a little weird for me--I didn't go to the public library. At all. (There's this whole thing about how the library in my hometown was my favorite place in the world until I turned 12 or 13, and then suddenly I realized the librarians were looking at me with ... some negative emotion I didn't bother defining, at the time. Having worked in a public library, myself, and having thought about it a bit, I realize it was probably dread. Teenagers are scary, because they're hard to relate to. We remember being teenagers, but we also remember what we thought of adults. You know what I mean?) I dearly loved the library at my university, but I retained my fear of librarians. How sad! I had absorbed that common misperception of librarians as cranky, bespectacled old ladies with book carts and stern expressions, and it didn't occur to me to ask them questions--even the obviously not old, not bespectacled, sometimes not even female librarians at UVA. Then I went to graduate school, where the Engineering & Science Library (where I work now!) was good as a silent study space, between classes, on days when I could deal with the oppressiveness of it all--something the students who wanted silence exuded, not something inherent to the library itself. (That part of it is still a problem for me. I hate walking past the study carrels. Though as time goes on, I become more sure of myself, and I imagine to myself that they realize I have work to do, to keep the library running.) I didn't know the librarians were super friendly and wanted to answer my questions! 
I wouldn't have dreamed of bothering them! I did all my searches online, in a combination of Google Scholar and IEEE Xplore (which, admittedly, did pretty much encompass my research).
This is all a very long-winded lead-up to the question: how do we deal with potential patrons like I was? I was too shy to ask for help. Frankly, I was too shy to venture into the library, except to study. I was intimidated by the catalog and by the shelves upon shelves of books. ... I guess therein lies a lot of the benefit of digital libraries; if shy patrons can find us online, at least they'll have access to some of our resources. But I'd like to address the bigger question, as it relates to brick-and-mortar libraries, at some point in the future. I'll keep thinking on it. Your comments are welcome!
Now to the much more relevant idea of federated search. I am interested in this. I was considering applying to PhD programs and trying to get funding to build a search utility that would go through a library's catalog and all of its databases, because <rant>the current way we do things is so backward and involved and frustrating. Why, after 8 weeks of doing reference for at least a few hours a week, am I still feeling less than confident in my ability to find absolutely everything in our system? That's absurd. There's no excuse for it. Sure, if you know the name of the journal you want to search, I can help you. And I have a passing familiarity with a growing subset of our journal offerings--and the databases that house them--so that I can find certain types of articles pretty well. But why should I have to know what every journal/database contains, in order to help a patron find the answer to a question I understand? [I get why I have to understand their questions.] Why can't I just type something in a search box?</rant> (I realize I'm proposing something that might end up putting some of us out of jobs, if ever implemented well. I think this is a noble goal, really. We're smart people; we'll find something to do. What's important is that information can be retrieved--ideally by everyone--right?)
It seems to me this is what federated search is out to solve (slowly, and with great limitations). I'm a little embarrassed that I thought nobody else had tried to solve this problem, admittedly, but I guess such is the dilemma of a grad student. Better that I'm thinking of solutions, even if they're already implemented (in some form or another) than that I ... don't? Eh.
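To make the idea concrete, here's a toy sketch of what I mean by federated search: send one query to several sources and merge what comes back. Everything in it is hypothetical (the make_source and federated_search names, and the in-memory lists standing in for a catalog and a journal database). Real systems have to speak protocols like Z39.50 and cope with slow or missing responses, which this ignores.

```python
from concurrent.futures import ThreadPoolExecutor


def make_source(name, titles):
    """Build a hypothetical searchable source over an in-memory list of titles."""
    def search(query):
        needle = query.lower()
        # Each hit records which source it came from.
        return [(name, title) for title in titles if needle in title.lower()]
    return search


def federated_search(query, sources):
    """Send one query to every source and merge the hits.

    Sources are queried in parallel; results are merged in source order,
    and a title already seen (ignoring case) is dropped as a duplicate.
    """
    with ThreadPoolExecutor() as pool:
        # pool.map keeps results in the same order as the sources list.
        result_lists = list(pool.map(lambda source: source(query), sources))

    seen = set()
    merged = []
    for hits in result_lists:
        for source_name, title in hits:
            key = title.lower()
            if key not in seen:
                seen.add(key)
                merged.append((source_name, title))
    return merged


if __name__ == "__main__":
    catalog = make_source("catalog", ["Digital Libraries", "Practical Metadata"])
    journals = make_source("journals", ["Metadata Quality in Digital Libraries"])
    for source_name, title in federated_search("digital", [catalog, journals]):
        print(source_name, "->", title)
```

The one search box sits in front of federated_search; the hard part in practice is everything this sketch skips, like ranking across sources and databases that each describe their records differently.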
There are still, clearly, significant hurdles to be overcome in all of this.
The D-Lib article was published in 2004; I wonder what academic libraries have done, since then, to respond to this problem--for those who don't feel like clicking, the problem is a lack of acknowledgment, on the part of academic libraries, of the tremendous amount of academic resources on the Web. My guess: not much. (I love academia, but I acknowledge its imperfections, slowness being a major one.)
Muddiest Point: Does the Greenstone installation on the lab computers do anything besides show us the demo library? Can we build libraries and burn them to CD at the lab? (This is of great importance, since Greenstone isn't installing properly on Dreamhost, and I have a Mac. Also, an unwillingness to install Apache on my Mac.)
Labels: 2670, classes, cmu, libraries, technology
Friday, October 10, 2008
[2670] Week ... 7, actually
Muddiest Point: My muddiest point from last week didn't get answered. Since it directly impacts the homework that's due on Monday, I'd be really grateful if somebody could help me out: I'm not sure I understand what an attribute is in XML. We saw a schema and how to set an attribute up in it, but not how it would look in the XML that fit the schema. (Apologies for imprecise terminology.)
Reading Response: I hadn't thought all that hard about how beefy a web crawler would have to be, given the volume of information the Big Three search engines index. Those are some mighty big numbers. And, you know, now that I'm thinking about it, of course they could shut down an entire domain if they did not have politeness algorithms in place. That they could shut down an entire country, I am still wrapping my head around. (To be fair, I still picture them as little spiders that "grab" links and report back information to big servers.)
I didn't see any mention of ill-mannered search engines, who ignore robots.txt files; I'm curious how search engines other than Google, Yahoo!, and Microsoft work. Probably similarly, but with lower ethical standards (due to less public attention ... or the need to get more of it?).
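As a rough illustration of what a well-mannered crawler does, here's a small sketch using Python's standard urllib.robotparser module. The robots.txt content, the "ExampleBot" agent name, and the polite_plan helper are all made up. A real crawler would fetch the file from the site and actually wait between requests.

```python
from urllib.robotparser import RobotFileParser

# A made-up robots.txt: keep everyone out of /private/ and ask for
# a two-second pause between requests.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Crawl-delay: 2
"""


def make_parser(robots_text):
    """Parse robots.txt text that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser


def polite_plan(parser, agent, urls, default_delay=1):
    """Return (url, delay_in_seconds) pairs for the URLs the agent may fetch.

    URLs disallowed by robots.txt are skipped entirely. The delay comes from
    Crawl-delay when the site sets one, otherwise from default_delay.
    """
    delay = parser.crawl_delay(agent) or default_delay
    return [(url, delay) for url in urls if parser.can_fetch(agent, url)]


if __name__ == "__main__":
    parser = make_parser(ROBOTS_TXT)
    urls = [
        "http://example.org/index.html",
        "http://example.org/private/notes.html",
    ]
    for url, delay in polite_plan(parser, "ExampleBot", urls):
        print("fetch", url, "then wait", delay, "seconds")
```

An ill-mannered crawler is simply one that never runs a check like can_fetch, or ignores the answer.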
It seems as though there's quite a lot of research going on in the digital information retrieval area, particularly in multimedia. I wish I were a better programmer; it sounds really interesting.
Really, I just wish everyone would just use metadata and be honest about it.
In support of Honest Metadata, I present to you Eleanor Rubidium Chinchillington (who just goes by "Ella"):

Labels: 2670
Monday, October 6, 2008
[2670] Interesting article
LibraryJournal covers some of the controversy around Google Books:
http://www.libraryjournal.com/article/CA6601209.html?nid=3285
Labels: 2670, libraries, technology
Saturday, October 4, 2008
[2670] Week 5 Reading Responses & Muddiest Point
Muddy point: I'm not sure I understand what an attribute is in XML. We saw a schema and how to set an attribute up in it, but not how it would look in the XML that fit the schema. (Apologies for imprecise terminology.)
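A small sketch of what I think the answer looks like, with the caveat that every element and attribute name here ("library", "book", "isbn", "format", "title") is hypothetical and not from the schema we saw in class. In the XML itself, an attribute is a name="value" pair written inside an element's opening tag, as opposed to a child element nested between the tags. The snippet uses Python's standard xml.etree.ElementTree to read both.

```python
import xml.etree.ElementTree as ET

# Hypothetical instance document. A schema might declare the attribute with
# something like: <xs:attribute name="isbn" type="xs:string" use="required"/>
XML_DOC = """\
<library>
  <book isbn="0000000000" format="paperback">
    <title>Example Title</title>
  </book>
</library>
"""

root = ET.fromstring(XML_DOC)
book = root.find("book")

# Attributes live on the opening tag and come back as a name -> value mapping.
attributes = dict(book.attrib)

# Child elements hold their content between the opening and closing tags.
title_text = book.find("title").text
```

So "isbn" and "format" are attributes of the book element, while "title" is a child element of it.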
Reading response (digital preservation - list below):
I don't know a lot about physical archiving, but it seems surprising to me that archivists didn't figure out immediately, when beginning digital projects, that of course the rules would be different, and different strategies would have to be employed. (Hindsight is 20/20, yes.)
I keep thinking that the problem of digital preservation shouldn't be so hard. We still have tape drives, after all--which are not so good for on-the-fly access, but do very well at storing large amounts of data for a long time. Why should 160GB hard drives, perhaps arranged RAID-style, not work out for a similarly long time? Do we predict that we will move away from hard drives so soon? (We might! But it isn't as though we'd magically lose the ability to access them right away.) When you migrate to a new technology, migrate your archives, too; it seems logical. I don't think the cost of storage is going to skyrocket any time soon. I wonder if maybe some kind of networked storage service is going to be the way to go--one institution does all the work for a bunch of others? (I'm thinking of a consortium of universities more than a corporation, here.)
http://www.sis.pitt.edu/~dlwkshop/paper_hedstrom.pdf
http://www.dpconline.org/docs/lavoie_OAIS.pdf
http://www.dpconline.org/graphics/handbook/index.html
http://www.dlib.org/dlib/july07/littman/07littman.html
Labels: 2670
Sunday, September 28, 2008
Dale's Nephew & My New Picasa
http://picasaweb.google.com/coral.hess/Elliot#
Labels: 2670, on a personal note
Friday, September 26, 2008
[2670] Reading comments and Muddiest point
I am really hoping we get the chance to use XML, for real. I've used HTML, but all of this reading about XML and schemas and stuff is not nearly hands-on enough for me to internalize any of it. It looks HTML-like, and it's all readable, I guess, but I'm not wrapping my head around how it would really be used. I want to have an XML parser in front of me and be able to build something and display it, you know?
Muddiest Point: This isn't from lecture, specifically, but I'm wondering: when is our midterm? Knowing that would help me plan out the rest of my semester a little better. Also, how are we going to handle there being 13 people and 12 computers?
Labels: 2670
Sunday, September 21, 2008
[2670] Week 4 Reading Responses
From Border Crossings (which I rather liked, in the sense that I'd like to sit down and chat with the author over a beer, despite the article's constant references to things I knew nothing about), "The answer [to the problem of metadata creation by untrained users] is that almost nobody will spend the time, and probably the majority of those who do are in the business of creating metadata-spam. Creating good quality metadata is challenging, and users are unlikely to have the knowledge or patience to do it very well, let alone fit it into an appropriate context with related resources. Our expectations to the contrary seem touchingly naïve in retrospect." -- Really? I thought we'd been finding the opposite. At least, that's what I recall Weinberger stating, in Everything is Miscellaneous. Del.icio.us was given as a specific example, wherever I saw that (although I have some doubts, I admit). Perhaps that's worth another look.
"People have to know and trust one another, which generally requires face-to-face engagement: transporting ourselves and our ideas to other time zones, surviving frequent-flyer-flues, finding the means to support travel costs, and missing baseball games of our children." -- My pulling out this quote is less relevant to metadata and more relevant to my opinions about the "online collaborative experience" Pitt tries to sell. More on that later, without the 2670 tag.
===========
"Although metadata is arguably a much less familiar term among creators and consumers of networked digital content who are not information professionals per se, these same individuals are increasingly adept at creating, exploiting, and assessing user-contributed metadata such as Web page title tags, folksonomies, and social bookmarks." -- Oh, hey, Introduction to Metadata suggests that maybe users aren't so bad at this. There you go, then; the truth is somewhere up in the air.
Metadata Encoding and Transmission Standard (METS) -- Want to find out more about this.
"As enunciated in Principle 6 of "Practical Principles for Metadata Creation and Maintenance" (p. 72), there is no single metadata standard that is adequate for describing all types of collections and materials; selection of the most appropriate suite of metadata standards and tools, and creation of clean, consistent metadata according to those standards, not only will enable good descriptions of specific collection materials but also will make it possible to map metadata created according to different community-specific standards, thus furthering the goal of interoperability..." -- Well, that isn't good news, precisely. I admit, I love panaceas.
I liked this reading; it contained loads of useful information and was both accessible and scholarly--a tricky thing to pull off. (Not sure I feel the need to drink beer with the author, though.)
===========
I liked how the book reading (I won't bother with bibliographic information here; nobody who doesn't already know this stuff will care to go find the source) said AACR2 is "almost persnickety." Just "almost." Right.
LaTeX is fantastic, by the way.
I also liked the discussion of automatic recognition and extraction of metadata. I wonder how much of this will be considered relevant for, for instance, our midterm...
Labels: 2670
Friday, September 19, 2008
[2670] Muddiest Point
I don't have a muddiest point this week, beyond the question of whether we can use Picasa instead of Flickr, which has already been asked. It probably doesn't matter, since 5 pictures and 5 thumbnails is hardly a giant amount of information, but my Flickr account is getting full; pictures will start disappearing if I add too many. Anyway, I'm going to have a super cute feed, since I got to spend time with a less-than-one-day-old baby yesterday. I'll post here to let you know when I have the pictures up. :)
Labels: 2670
Friday, September 12, 2008
[2670] Muddiest Point
Muddiest point: Users of DLs are still in a client-server relationship, even if two DLs are in a p2p relationship, right?
Labels: 2670
[2670] Week 3 - Digital Object Identifiers
I am a cynic and a skeptic and a pessimist, and I'm aware of it. So it's no surprise to me--and if you've been reading for long, probably not to you, either--that I have very little hope for the URN or DOI idea ever really working out. (That is, the idea of giving every digital object a unique identifier, along the same lines as an ISBN/ISSN, instead of relying on URLs, which are subject to change. An important point about these identifiers: they wouldn't necessarily specify where to go to get any given Digital Object; they might just make one clearly discernible from another. Or a URN might resolve to multiple URLs.) I think managing something on that scale--a scale greater than that of DNS/URLs, since each object would be identified, not just each server--is going to be, to put it very plainly, more trouble than it's worth. There would be benefits to such a system, if it were ever fully deployed, sure, but how could it be done meaningfully?
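To make the "identifier, not location" idea concrete, here is a toy sketch of a resolver. The URN, the URLs, and the function names are all invented for illustration; a real resolution service would be far more involved than a dictionary lookup.

```python
# Hypothetical registry mapping one identifier to every known copy of the object.
REGISTRY = {
    "urn:example:blog-post-0001": [
        "http://example.org/posts/1",
        "http://mirror.example.net/archive/posts/1",
    ],
}

def resolve(urn, registry=REGISTRY):
    """Return all known locations for an identifier (an empty list if unknown)."""
    return list(registry.get(urn, []))

def relocate(urn, old_url, new_url, registry=REGISTRY):
    """Record that one copy moved; the identifier itself stays the same."""
    registry[urn] = [new_url if url == old_url else url
                     for url in registry.get(urn, [])]
```

The identifier survives a move because only the registry entry changes. The hard part, at the scale I'm worried about, is keeping that registry accurate for every object rather than every server.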
Is this blog post a Digital Object (in the sense of having its own identifier or URN, in the hypothetical scenario where there is such a scheme)? And if I change it a year from now because I think my writing style is embarrassingly informal, has it become a different Digital Object? If you copy it down and put it into your blog--hopefully with attribution--is it the same Digital Object or a different one? (By my reading, it's the same, at least in the URN scheme. But when I go back and change my wording, it won't change the wording of the copy on your blog.)
As the authors of this article say, near the end, there's an awful lot left unresolved about this whole set of ideas. I think it's very pie-in-the-sky, a bit like the Semantic Web. (Yeah, I went there.)
Labels: 2670, classes, libraries, library school, technology
Friday, September 5, 2008
[2670] Week 2 Reading Responses
Overall themes: interoperability, modularity.
A Framework for Building Open Digital Libraries has me totally sold on the ODL concept and on the extension of the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) to build every future Digital Library ever. I think it's a great idea; interoperability is a desirable thing. My one critique is that their very simple mock-up and animated gif detracted, a little bit, from the picture they were painting. Perhaps I am unnecessarily picky.
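As a rough illustration of why harvesting is appealing, here is a sketch of how a harvester might build a request. An OAI-PMH request is an ordinary HTTP request with a "verb" parameter; ListRecords and the oai_dc (Dublin Core) metadata prefix come from the protocol itself, but the repository URL and the function name are hypothetical, and this is not code from the article.

```python
from urllib.parse import urlencode

def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build a ListRecords request URL for a repository's OAI-PMH endpoint."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec is not None:
        # Optionally restrict the harvest to one set (collection) in the repository.
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)

# Hypothetical repository address, for illustration only.
url = list_records_url("http://repository.example.org/oai")
```

Any repository that answers that kind of request with XML records can be harvested by the same client, which is the interoperability argument in miniature.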
Architecture for Information in Digital Libraries is interesting enough, but I'd love to know what they've done in the last decade. As I was reading, I found myself wondering if the meta-object to object link worked in the opposite direction; that is, whether pulling up an object would pull up a link to its meta-object (for instance, if the object is part of a larger collection). I would think it would come up on the catalog page when a search is done, but I was just surprised not to see them point that out explicitly.
I smiled when I saw that they based RAP on CORBA. That was the big thing, back then. And it stayed big for quite a while; I imagine it's still fairly widely used nowadays, even. (Though I admit, I really don't know. I hear something [neither a protocol nor a language] called "SOA" is in vogue, now, but I don't delve into specifics.)
As I read through Interoperability for Digital Objects and Repositories, I began to be grateful that our reading list was put in the order it was. They just whip through those acronyms. But I like the structure of their experiment, and I admit, I was holding my breath, a little bit, wondering whether they would find their systems interoperable--even after extending them (if that's the right conjugation of the verb that goes with "extensibility"). Then I began to get worried again, until, finally, in their last paragraph, they mentioned their plan to add access management. (I know if I were curating a DL or DA, I wouldn't want to grant remote locations the ability to add digital objects except in very specific ways.)
I decided that the broken link in Blackboard must have meant to refer to this particular description of the Internet.
I'm pretty familiar with web technology, so I didn't find too much to say about this article. I think he's a little bit overzealous in his defense of Internet-as-proto-DL; the truth lies somewhere between his statements and the statements he derides. There's hope for the 'net, but I could definitely see it going either way, at this point.
(A lighthearted aside: "Recently, attempts have been made to rewrite the history of the Internet ... and for individuals to claim responsibility for achievements that many shared." Hey, now! That quote was taken out of context! He was joking!)
I have another aside, not strictly relevant to this article, but the discussion of Los Alamos brought it to mind. I've seen several articles--including a required reading for Understanding Information--that suggest that the sciences are all progressive, all sharing their information immediately and collaboratively over the Web, but I just don't see it. At least in engineering, which, despite Kuhn's disparagement in The Structure of Scientific Revolutions, is a subset of "the sciences" (seriously, ask me about my research), we tended to hold our papers--and with them our most recent research results--until a conference accepted them. And then the conferences (really, the IEEE) required that we not post the papers anywhere else. (That's what I recall, anyway.) With conference deadlines falling six months or more before the conferences themselves, I really feel that this "real-time collaboration" people talk about is not particularly widespread.
Don't misunderstand me: I'm in favor of it. But the current methods of determining tenure, hooding, and so on would have to change significantly before a "share and share alike" system will really become tenable.
Labels: 2670, classes, engineering, libraries, library school, technology
Thursday, August 28, 2008
[2670] Week 1 Reading Responses
For this week, I'm commenting informally and responding to things as I read them. Not "First I will read the article, and then I will post my fully processed thoughts," but "Hey, that sentence seems worth commenting on. Here's what I think." I see some benefit to both approaches, and I hope to try both throughout the semester. For that matter, when I get Understanding Digital Libraries (Lesk), I may do that with Chapter 1. (Yes, this will be posted without thoughts on Chapter 1. If I have thoughts worth posting, I'll come back and edit this post, though.)
If this seems far too informal, please feel free to drop me a note (coral dot hess at gmail dot com), and I'll cut it out and keep my posts more to the point.
---------
I take issue with the author of Digital Libraries and the Problem of Purpose's attitude about what a public library is. It seems like he waves away precisely the roles I would say the library fills, "an all-purpose information center, ... a community center, ... a center for adult education, [and] ... the guardian of free speech," to make a claim I take real issue with: "... public libraries finally began to come to terms with their more limited but realistic purpose: to be suppliers of books to the middle class and a symbol of culture in the community." (He was, in turn, citing a private communication, something that perhaps wasn't intended to be cited or taken as speaking for every public library, to such a large audience.) I don't think that's a fair description, at all. Libraries are far more than a symbol; in many places, they very much serve as the anchors and educators of their communities. To refer to them as "symbol[s] of culture" is, I think, the tiniest bit condescending. Further, I read something mildly disparaging in the use of the term "middle class," as though a library ought only to cater to some other group, instead. And, if it were the case that the middle class were the only users of the library, I could see where he might be coming from. It's my experience, though, that public libraries serve a more diverse set of economic groups than "the middle class."
His over-arching point, that we should think a bit about what digital libraries are and should be--what we should try to make them--I have no issue with. He's probably right. My suggested approach to dealing with digital libraries, "Grow all collections, including digital ones, based on user needs/demands and technological innovation, organically--but also intelligently," is no doubt naive. Or at least lacking in detail. I understand that. And I'm looking forward to refining that viewpoint throughout the semester and the next few (many?) years.
---------
I love that Automated Digital Libraries discusses the Internet Archive, or Wayback Machine.
"Disintermediation" is an interesting term; I think it has an unnecessarily negative sound to it, as though it is the librarian's right to serve as the gateway between a patron and the information they seek. That doesn't seem right to me. I, for one, am not offended by patrons who can find information without a librarian's help, although I am always happy to step in and help when I can.
I'm going to have to think a bit about the observation that most of the successful automated digital library projects--at least, the ones he deems worth mention--were made outside of and separate from established libraries. That interests and concerns me. But I wonder how much it really should concern me...
---------
In Dewey Meets Turing, I found the following quote intriguing: "librarians who involved themselves in the Initiative understood that information technologies were indeed important to ensure libraries' continued impact on scholarly work." If this article has it right, the concern was that libraries would be left out in the cold, so to speak. "We have to upgrade and innovate, or else we will be left behind and no longer have an impact," it seems to say. There is, perhaps, some worry about job security implied. I guess it stood out so much to me, because the previous article discussed automated digital libraries, with no need for librarians. So we build digital libraries to secure the jobs that will be replaced by very large computers? (I know I'm oversimplifying, but that's what went through my head as I read it.)
I'm only about 2/3 of the way through, as I make this comment, but I wonder: am I reading too much into this article, or is it kind of painting CS types as heroes and LIS types as traditionalists? I've got some background in CS, and I can't find any one comment I disagree with. It's just a tonal thing, possibly imaginary.
"The notion of collections is spontaneously re-emerging in the form of what computer scientists have named information 'hubs.'" Really? I'm having trouble thinking of an example of a "hub." (And, I confess, not having an ACM account or the patience to go through Pitt's or CMU's VPN, I didn't read the article linked to the word "hub.") Do any readers have an example? I feel like there's probably some obvious thing, but all that's coming to mind is Wikipedia--something that certainly is not curated in any meaningful way.
I like the authors' hopeful tone.
---------
From Gutenberg to the Global Information Infrastructure, I think, missed out on a real opportunity. s/Infrastructure/Grid/, and you get a far better-sounding title. (I kid.)
I have to admit, this "sniping" between CS people and LIS people keeps coming up, and even though it makes perfect sense, I am a little surprised to learn about it. I can easily see how it comes about, now that I have occasion to think about it; I just hadn't thought about it before, I suppose.
Note to self: check out http://memory.loc.gov.
This seemed important enough to call out: "Griffiths (1998) confronts the question of 'why the web is not a library.' Her reasons include incompleteness of content, lack of standards and validation, minimal cataloging, and ineffective information retrieval. To this I add that the World Wide Web is not an institution and is not organized on behalf of a specifiable user community." Indeed. This is a very good answer.
I admit, this whole reading (Chapter 2, for those following along at home) was a bit of a blur for me. It is focused pretty much entirely on definitions and semantics, something I have really limited interest in. I do realize the importance of the discussion (as the author himself says, "Words do matter, however, and they will influence the success of our ventures"). But I'm going to have to look over this reading again, at a later date, if I really want to absorb it all.
Still, I feel like the whole point is captured very well in this summary (copied from the book): "From a research perspective, digital libraries are content collected and organized on behalf of user communities. From a library-practice perspective, digital libraries are institutions or organizations that provide information services in digital forms. Definitions are formulated to serve specific purposes. The research community's definitions serve to identify and focus attention on research problems and to expand the community of interest around those problems. The library community's definitions focus on practical challenges involved in transforming library institutions and services. Databases available on the Internet, on proprietary services, and on CD-ROMs fall into a gray area."
---------
Before I get too far into Setting the Foundations of Digital Libraries, I have to admit, I have a warm spot in my heart for Manifestos. Or, at least, I smile when I read the word, for no good reason I can name.
I like that this paper presents the different ideologies not as CS vs. LIS, but as having "shifted" from one to the other.
Ha: "this terminological imprecision has produced a plethora of heterogeneous entities..." (They're absolutely right, of course. But you've gotta love that phrasing.)
I like that they are trying to structure the conversation about digital libraries and standardize the terms that are used. I don't feel as though their diagrams convey all that much extra information, to be honest, and their text could more or less stand alone.
I think internalizing all of their terminology and using it consistently would be a good idea. (Dr. He, it's been over a year since this paper came out--and longer since the Manifesto--does it seem as though the DL community at large is using this terminology? Do American and European papers differ, much, in their adherence to this vocabulary?) That would take me, at least, some time. But having structured language is important and worthwhile.
If this seems far too informal, please feel free to drop me a note (coral dot hess at gmail dot com), and I'll cut it out and keep my posts more to the point.
I take issue with the author of Digital Libraries and the Problem of Purpose's attitude about what a public library is. It seems like he waves away precisely the roles I would say the library fills, "an all-purpose information center, ... a community center, ... a center for adult education, [and] ... the guardian of free speech," to make a claim I take real issue with: "... public libraries finally began to come to terms with their more limited but realistic purpose: to be suppliers of books to the middle class and a symbol of culture in the community." (He was, in turn, citing a private communication, something that perhaps wasn't intended to be cited or taken as speaking for every public library, to such a large audience.) I don't think that's a fair description, at all. Libraries are far more than a symbol; in many places, they very much serve as the anchors and educators of their communities. To refer to them as "symbol[s] of culture" is, I think, the tiniest bit condescending. Further, I read something mildly disparaging in the use of the term "middle class," as though a library ought only to cater to some other group, instead. And, if it were the case that the middle class were the only users of the library, I could see where he might be coming from. It's my experience, though, that public libraries serve a more diverse set of economic groups than "the middle class."
His over-arching point, that we should think a bit about what digital libraries are and should be--what we should try to make them--I have no issue with. He's probably right. My suggested approach to dealing with digital libraries, "Grow all collections, including digital ones, based on user needs/demands and technological innovation, organically--but also intelligently," is no doubt naive. Or at least lacking in detail. I understand that. And I'm looking forward to refining that viewpoint throughout the semester and the next few (many?) years.
I love that Automated Digital Libraries discusses the Internet Archive and its Wayback Machine.
"Disintermediation" is an interesting term; I think it has an unnecessarily negative sound to it, as though it is the librarian's right to serve as the gateway between a patron and the information they seek. That doesn't seem right to me. I, for one, am not offended by patrons who can find information without a librarian's help, although I am always happy to step in and help when I can.
I'm going to have to think a bit about the observation that most of the successful automated digital library projects--at least, the ones he deems worth mention--were made outside of and separate from established libraries. That interests and concerns me. But I wonder how much it really should concern me...
In Dewey Meets Turing, I found the following quote intriguing: "librarians who involved themselves in the Initiative understood that information technologies were indeed important to ensure libraries' continued impact on scholarly work." If this article has it right, the concern was that libraries would be left out in the cold, so to speak. "We have to upgrade and innovate, or else we will be left behind and no longer have an impact," it seems to say. There is, perhaps, some worry about job security implied. I guess it stood out so much to me because the previous article discussed automated digital libraries, with no need for librarians. So we build digital libraries to secure the jobs that will be replaced by very large computers? (I know I'm oversimplifying, but that's what went through my head as I read it.)
I'm only about 2/3 of the way through, as I make this comment, but I wonder: am I reading too much into this article, or is it kind of painting CS types as heroes and LIS types as traditionalists? I've got some background in CS, and I can't find any one comment I disagree with. It's just a tonal thing, possibly imaginary.
"The notion of collections is spontaneously re-emerging in the form of what computer scientists have named information 'hubs.'" Really? I'm having trouble thinking of an example of a "hub." (And, I confess, not having an ACM account or the patience to go through Pitt's or CMU's VPN, I didn't read the article linked to the word "hub.") Do any readers have an example? I feel like there's probably some obvious thing, but all that's coming to mind is Wikipedia--something that certainly is not curated in any meaningful way.
I like the authors' hopeful tone.
From Gutenberg to the Global Information Infrastructure, I think, missed out on a real opportunity. s/Infrastructure/Grid/, and you get a far better-sounding title. (I kid.)
I have to admit, this "sniping" between CS people and LIS people keeps coming up, and even though it makes perfect sense, I am a little surprised to learn about it. I can easily see how it comes about, now that I have occasion to think about it; I just hadn't thought about it before, I suppose.
Note to self: check out http://memory.loc.gov.
This seemed important enough to call out: "Griffiths (1998) confronts the question of 'why the web is not a library.' Her reasons include incompleteness of content, lack of standards and validation, minimal cataloging, and ineffective information retrieval. To this I add that the World Wide Web is not an institution and is not organized on behalf of a specifiable user community." Indeed. This is a very good answer.
I admit, this whole reading (Chapter 2, for those following along at home) was a bit of a blur for me. It is focused pretty much entirely on definitions and semantics, something I have really limited interest in. I do realize the importance of the discussion (as the author himself says, "Words do matter, however, and they will influence the success of our ventures"). But I'm going to have to look over this reading again, at a later date, if I really want to absorb it all.
Still, I feel like the whole point is captured very well in this summary (copied from the book): "From a research perspective, digital libraries are content collected and organized on behalf of user communities. From a library-practice perspective, digital libraries are institutions or organizations that provide information services in digital forms. Definitions are formulated to serve specific purposes. The research community's definitions serve to identify and focus attention on research problems and to expand the community of interest around those problems. The library community's definitions focus on practical challenges involved in transforming library institutions and services. Databases available on the Internet, on proprietary services, and on CD-ROMs fall into a gray area."
Before I get too far into Setting the Foundations of Digital Libraries, I have to admit, I have a warm spot in my heart for Manifestos. Or, at least, I smile when I read the word, for no good reason I can name.
I like that this paper presents the different ideologies not as CS vs. LIS, but as having "shifted" from one to the other.
Ha: "this terminological imprecision has produced a plethora of heterogeneous entities..." (They're absolutely right, of course. But you've gotta love that phrasing.)
I like that they are trying to structure the conversation about digital libraries and standardize the terms that are used. I don't feel as though their diagrams convey all that much extra information, to be honest, and their text could more or less stand alone.
I think internalizing all of their terminology and using it consistently would be a good idea. (Dr. He, it's been over a year since this paper came out--and longer since the Manifesto--does it seem as though the DL community at large is using this terminology? Do American and European papers differ, much, in their adherence to this vocabulary?) That would take me, at least, some time. But having structured language is important and worthwhile.
Labels: 2670, library school
Tuesday, August 26, 2008
[2670] Muddiest Point

In lecture, Dr. He mentioned that we should post the "Muddiest Point" from each of [at least ten of] his lectures. With 14 or so weeks of class, there is room for us not to post it each week.
The first lecture seemed very accessible, and I found very little to ask questions about. That's pretty fantastic, except it makes me worry that maybe not everyone will be able to come up with a Muddiest Point for 10 out of 14 lectures (give or take). Maybe some subset of us will find everything as understandable as Lecture 1, in which case... what should we do? Should we maybe come up with a question related to, but not really covered in, that lecture? Or should we perhaps just point out what we thought was the most important topic covered in that lecture? Or do we just not post a Muddiest Point at all, that week?
(Is this too meta to count as a Muddiest Point? I admit, I'm probably worrying unnecessarily about this. Surely, we'll wander into the weeds and find confusion, soon enough. :))
Labels: 2670, library school
