January, 2022

The Scholarly Kitchen has a piece about Google Books and their failed attempt to scan it all. It is a review of Along Came Google: A History of Library Digitization by Deana Marcum and Roger Schonfeld, published last September by Princeton University Press. (I wrote about Google Books a few years ago.)

Now of course it’s a pity we don’t have free access to every book published in the past — but our loss is surely fairly small in the scheme of things. And be it noted that often the scanning may have a flaw or two. This scan,

of the map of Galashiels in 1795 facing page 76 in Robert Hall’s History of Galashiels, is frustrating, but ultimately pretty irrelevant in a free Internet Archive version of Google’s scan. However if you had paid $65.32 for the Facsimile Publisher’s edition of the book, you might be less charmed to find this hand in place of the map. Of course, as I haven’t paid $65.32 I don’t know that they didn’t take some remedial action, but I suspect many a fly-by-night publisher wouldn’t even be aware of such an issue. See Forgotten books for some examples of publishing morality.

In a way, whether we have digitized versions of everything, or just bits of everything, is an archetypical librarian’s issue. For most of us the question is, is there a digitized version of book X, which I want to look at right now? If there is, great: if not, we move on. Having online access to everything ever written would be great, but can you not remember that there once was a time when if you wanted to see what someone had written you had to find a library, find the book, sit down and open it? Amazingly we somehow managed, and might I suggest, still can.

Last year Spice DAO* invested $3 million in a copy of a rare book, Jodorowsky’s Dune. This book is no doubt the script and storyboards for Chilean film director Alejandro Jodorowsky’s movie based on Dune by Frank Herbert which slouched towards a birthing some time in the 1970s. Apparently only ten or so copies were made.

The story brought to us by Plagiarism Today implies that Spice DAO can’t tell the difference between “a book” as a physical object, and “a book” as content, in which form it is of course protected by the law of copyright. But the story seems to be more nuanced than Plagiarism Today suggests. At The Verge, Adi Robertson gives us a more sober analysis. Turns out that Spice DAO know exactly what it is they have and haven’t bought, and understand perfectly well that among the things they haven’t bought are any intellectual property rights. What Ms Robertson suggests they have bought is a modicum of publicity. Maybe there will be a rush to buy NFTs of individual pages of the book which it appears is one of the ways Spice DAO plan to recoup their investment. There may be an eagerness to invest in NFTs which outstrips our understanding of what an NFT is.

A Non-Fungible Token is a unique digital address stored on a blockchain. If anyone was silly enough to pay me for it, I could in theory sell them “ownership” of this photo of our apartment window showing the top of our Christmas tree. Their ownership would be secure, guaranteed by the power of blockchain. So in exchange for a nice cheque you can become the undisputed, and indisputable, owner of this picture. Sounds irresistible, doesn’t it?

Ownership of this picture would mean little. I could if I wanted sell another NFT of it to your neighbor, and if you sought to reproduce the photo I could sue you for copyright infringement. While you could resell the NFT itself, if you hoped to sell a copy of it, you couldn’t, nor could you create another NFT of it. Of course you could in theory bargain with me to acquire the rights to do these things — if you could afford them — but I suspect much of the excitement around NFTs these days is based on the assumption that these rights are automatically included in the purchase. Lots of people are probably going to wake up one day and regret their expenditure on NFTs of individual pages from Jodorowsky’s Dune.

Just who’d own the rights to the story bible of a movie that was never made isn’t altogether obvious, but it is the sort of thing that could be found out, one would think. So supposedly permission to do something could be sought from someone somewhere. Mr Jodorowsky is still with us, approaching his 93rd birthday. Confusingly there is actually a movie called Jodorowsky’s Dune. This is a documentary about Mr Jodorowsky’s efforts to make the film adaptation of Dune, which effort collapsed when the director refused to compromise on length: the studio liked two hours, he wanted ten to fourteen. He hoped to give roles to Salvador Dali and Orson Welles and have the music taken care of by Pink Floyd. In the end, David Lynch’s version of Dune came out in 1984, and of course there’s now a mini-series making the rounds.

An NFT of a book sounds rather unlikely. Here I find myself in agreement with Kristine Kathryn Rusch: “NFTs have the feeling of a fad (at best) or a scam (at worst) to me.” Maybe there’s a place for them in the computerized areas of the graphic arts, but a book sounds about the opposite of an exclusively owned artwork. Maybe the manuscript might be an apt subject for NFT-ification, but would you want to own an NFT of the manuscript of The Sound and the Fury, rather than the actual manuscript? But maybe I’m missing the whole point of NFTs.

And in the meantime $3 million sounds like a risky investment.

See also NFTs from a couple of months ago.


* DAO stands for “decentralized autonomous organization”. It is described at Wikipedia as “an organization represented by rules encoded as a computer program that is transparent, controlled by the organization members and not influenced by a central government. A DAO’s financial transaction record and program rules are maintained on a blockchain”

“A blockchain is a growing list of records, called blocks, that are linked together using cryptography. Each block contains a cryptographic hash of the previous block, a timestamp, and transaction data (generally represented as a Merkle tree). The timestamp proves that the transaction data existed when the block was published in order to get into its hash. As blocks each contain information about the block previous to it, they form a chain, with each additional block reinforcing the ones before it.”
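For anyone who finds that definition abstract, here is a minimal Python sketch of the "each block contains a hash of the previous block" idea. It is only an illustration under my own assumptions: the function names and the sample transfers are invented, the transaction data is a plain string rather than a Merkle tree, and there is none of the networking or consensus machinery a real blockchain relies on.

```python
import hashlib
import json
import time


def block_hash(block):
    # Hash a canonical JSON form of the block's contents.
    payload = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def add_block(chain, data):
    # Each new block stores the hash of the block before it
    # (the first block gets a placeholder of 64 zeros).
    prev = block_hash(chain[-1]) if chain else "0" * 64
    chain.append({"prev_hash": prev, "timestamp": time.time(), "data": data})


def chain_is_valid(chain):
    # Recompute every link; editing an earlier block breaks the link after it.
    return all(
        chain[i]["prev_hash"] == block_hash(chain[i - 1])
        for i in range(1, len(chain))
    )


# Hypothetical ledger of who holds a token.
ledger = []
add_block(ledger, "token 1 issued to Alice")
add_block(ledger, "token 1 transferred to Bob")
add_block(ledger, "token 1 transferred to Carol")
intact = chain_is_valid(ledger)

# Quietly rewrite history in the middle block and check again.
ledger[1]["data"] = "token 1 transferred to Mallory"
tampered_ok = chain_is_valid(ledger)
```

The check passes on the untouched ledger and fails once the middle block is altered, which is all that "each additional block reinforcing the ones before it" amounts to in this toy version.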

Publishers Weekly tells us about a recent U.S. Census Bureau report on physical bookstores: “Bookstore sales fell from a peak of $16.8 billion in 2007 to $9.12 billion in 2019, a decline of 45% (bookstore sales were $6.50 billion in 2020 because of the impact of the pandemic, but are on track to be close to 2019 levels in 2021). During that same span, the number of stores fell about 40%, from just under 10,000 to 6,045 in 2019. . . . Based on data from the 2017 Economic Census (a study conducted every five years), books made up 70.9% ($7.1 billion) of the total of $10 billion in bookstore sales in 2017. Among the products besides books that were sold in meaningful quantities at bookstores were: office and school supplies and packaging materials ($412.3 million, 4.1%); toys, games, and hobby supplies ($378 million, 3.8%); meals and beverages ($305.7 million, 3.1%).”

The question is: What’s going on? Are things getting better or worse? The pandemic does mask reality a bit, but the trend was clearly underway before that. If we accept 2019 as a baseline, we’re not doing too badly today, but of course 2019 was 45% down on 2007. It almost seems quaint to need to be reminded, but don’t forget that the closure of chain bookstores and the rise (not unconnected) of Amazon were already driving industry change.
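The percentages in the quotation are easy enough to check. A small Python sketch, using only the figures quoted above; the function name is my own, and I have taken "just under 10,000" stores as a round 10,000, which is an assumption.

```python
def pct_change(old, new):
    # Percentage change from old to new; negative means a decline.
    return (new - old) / old * 100


# Bookstore sales in $ billions, as quoted from Publishers Weekly.
sales_2007, sales_2019, sales_2020 = 16.8, 9.12, 6.50

drop_2007_to_2019 = pct_change(sales_2007, sales_2019)
drop_2019_to_2020 = pct_change(sales_2019, sales_2020)

# Store count: "just under 10,000" is treated as 10,000 (an assumption).
stores_change = pct_change(10_000, 6_045)

print(f"2007 to 2019 sales: {drop_2007_to_2019:.1f}%")
print(f"2019 to 2020 sales: {drop_2019_to_2020:.1f}%")
print(f"Store count:        {stores_change:.1f}%")
```

On these numbers the 2007 to 2019 decline comes out a shade under 46%, the store count is down a little under 40%, and the pandemic year alone took roughly a further 29% off 2019 sales.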

I’m surprised at the large brown slice that is textbooks, almost as big as the trade book slice, but of course, unless you are a student, it’s easy to overlook the college bookstore. Our nearest bookshop is in fact a college bookstore, and it does offer a selection of non-text books. Sobering is the tiny pale blue sliver that contains all academic, professional, and reference books (as well as audio and ebooks — though nobody would expect many audio or ebook sales to take place in a bricks-and-mortar store). Are we moving, even more strongly than ever, towards a world where academic and professional books are bought through Amazon (or let’s just say online retail) where you can expect the widest selection of more specialized books, while bookstores remain the place you’d go to pick up the latest bestseller? While it’s nice for a university press to think that their less specialized titles will show up in a bookshop, realistically speaking you are much more likely to find a copy of such a book by going online. For a bookshop to stock academic books represents a faith-leap rather than a sound business strategy, and it’s likely soon to become so rare as to be effectively non-existent, if it hasn’t already.

A straw in this wind, and a fairly large straw at that, is the fact that Barnes & Noble’s comeback seems to be going well. Obviously everyone knows you can buy books online, and maybe people carry out some sort of research into what they might want to get before they walk into their local bookshop — but lots of people do seem to want to handle the goods before they buy their next read. Maybe there will remain sufficient demand to keep many/most bookstores going once things settle down after pandemic disruptions.

In Bookstore futures I asked last May if there was any data to show that publishers’ sales had improved as a result of increased ebook sales. Seems not. Indeed Publishers Weekly just disclosed to us that “After rising 11% in 2020, e-book sales declined 3.7% last year and the format’s share of adult sales dropped from 17.1% in 2020 to 14.7% last year.” If you can’t access the PW story because of paywalls here’s a link to Publishing Perspectives’ version.

Kristine Kathryn Rusch gives us an analysis of recent developments in indie publishing.*

Ms Rusch claims that the pandemic and its disruptions have caused the world of indie publishing (by which we of course mean non-traditional publishing, what used to be called self publishing tout court) to split into five different parts. These parts are (I think):

  • Actual self-publishing
  • Individual data managers
  • Small writer-owned publishers
  • Small traditional publishers
  • Small entertainment companies

Her second category, Individual data managers, seems to refer to self publishers who are adept at manipulating social media in order to maximize the sale of their books. Now, it goes without saying that some authors will be more skilled at digital manipulation than others, but I don’t really see how a lack of conscience should elevate them to a separate analytical grouping. “Actual self-publishers” will be more or less successful in their sales efforts just as you might argue that Penguin Random House were more successful in their sales efforts than many other presses. That doesn’t mean that PRH and Podunk University Press are not both traditional publishers.

Her fifth group, Small entertainment companies, seems to be constructed to contain Wattpad and Open Road. Not sure I understand why these companies (as well as Small traditional publishers) exemplify self publishing, but my analytical eyes may just be failing to divide the world up properly here!

Seems to me Ms Rusch ignores one or two ancient strands of indie publishing. One is the publishing of books by a club, society, charitable group or whatever. I’ve got a copy of Galashiels: A Modern History published by The Ettrick and Lauderdale District Council in 1983. The first edition was published in 1898 by the Galashiels Manufacturers Corporation, an organization of woolen manufacturers. The authoritative Royal Doulton Figures Produced at Burslem c1890-1978 was published in 1978 by Royal Doulton Tableware Limited. I also have a cookbook published in 1975 by St Luke’s School in Manhattan. They did it themselves (with help, I dare say, from parents with industry connections). The one I’ve got from Newnham Croft School in Cambridge, England is a slicker job and was “published” by Imprimata Publishers Ltd. which is another category ignored by Ms Rusch — providers of services to self-publishers. (They are now out of business I think.) See Self published or privately printed for lots of examples of this sort of thing, including my German teacher’s Grammar Notes.

Another line of business she ignores is vanity publishing. This may be seen as having been superseded by what we now (perhaps) call Hybrid publishing. The money that used to be earned by vanity publishers is now being earned by service providers who exist in order to help authors get from manuscript to print. I dare say many of the individual players are the same people.

My beef with Ms Rusch’s piece isn’t really that five’s the wrong number. It’s that any number, less perhaps than infinity, is the wrong number. Gaul was never divided into three parts: it’s just that Caesar chose to analyze it that way. Ms Rusch calls for five parts; my “Hybrid publishing” post called for six, and in light of what I write here that may be inadequate. I think the problem with all this is the impulse to divide it all up at all. Publishing has evolved into a widely varied business. Trying to divide it up into groups is a waste of time as there will always be exceptions which could be fitted in between a couple of anyone’s categories. Just leave it alone and get on with your publishing. All the different solutions will happily coexist.


* Typically of course Ms Rusch is unable to resist a passing side-swipe at traditional publishers. While indie publishers are exemplarily digital, “Traditional publishing is still fighting hard against that, however. (Sigh again).” She sighs, as do I in reporting her beating of this dead (or imaginary) horse. If traditional publishers do not turn their entire output over to the ebook format, this is not because they are prejudiced against it, or disgusted by or scared of the implications of digital (a format in which their profits can be maximized). It is because that is what their customers want. (Sigh yet again.) When their customers stop wanting print books, publishers will stop providing them. Duh!

One might, for symmetry’s sake, mention that indie publishers strongly favor the ebook format, and the likes of Ms Rusch in Reverse might accuse them of being frightened of print books. Of course a few indie publishers do offer them, especially via print-on-demand, but for an individual author/publisher having to store cartons of printed books in your coat closet and mail them out whenever someone wants to buy one is not a really viable option. Thus they do ebooks. Not that they do, but traditional publishers, if they had the self-justificatory need, might complain that the Indies are blindly, stupidly ignoring the format that readers still want most. The one complaint is as stupid as the other.

Publishers Weekly has conducted an analysis of 2021’s bestsellers. 2021 was a banner year for publishers: obviously it was an extra-banner year for Penguin Random House. But, given how the company was set up, which one of us can claim surprise? Hardbacks and paperbacks do show a different pattern: but which one would you rather dominate?

There are lots of other breakdowns at the article. Bear in mind that this is just a calculation based on the number of appearances in a PW bestseller list: No. 1 is no more valuable here than No. 5, except for the probability that the No. 1 title may linger in the list for more weeks as it rises and falls, and thus rack up more mentions.
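To make the counting method concrete, here is a rough Python sketch of tallying list appearances by publisher with rank ignored. It is my guess at the general approach, not PW's actual procedure; the titles, publishers and three-week sample are entirely hypothetical, and I am assuming one count per title per weekly list.

```python
from collections import Counter

# Hypothetical weekly lists: each entry is (rank, title, publisher).
weekly_lists = [
    [(1, "Title A", "Publisher X"), (2, "Title B", "Publisher Y"), (3, "Title C", "Publisher X")],
    [(1, "Title B", "Publisher Y"), (2, "Title A", "Publisher X"), (3, "Title D", "Publisher Z")],
    [(1, "Title A", "Publisher X"), (2, "Title D", "Publisher Z"), (3, "Title E", "Publisher X")],
]

# Every appearance counts once for its publisher, whatever the rank.
appearances = Counter(
    publisher for week in weekly_lists for _rank, _title, publisher in week
)

for publisher, count in appearances.most_common():
    print(publisher, count)
```

In this made-up sample a title that stays on the list for three weeks earns its publisher three appearances whether it sits at No. 1 or No. 3, which is the point about lingering titles racking up more mentions.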

From Robert Hall: The History of Galashiels (Published on subscription by the Galashiels Manufacturers Corporation, 1898).

“The first library in the town was founded by Dr Douglas in 1797, and was termed the Galashiels Subscription Library. The first minute-book is lost; the first entry in that yet existing is dated 20th November, 1827. The rules provided that members had to pay an entrance fee of five shillings, besides an annual subscription of four shillings. Those falling in arrears for eighteen months were expelled, and fines were levied upon those who failed to attend the annual meeting, or who allowed a non-member the use of a book belonging to the Library. No books hostile to revealed religion or of an immoral tendency, nor those treating on divinity, law, physic, or politics could be acquired unless ordered by a majority of members at a general meeting. The Library was open every Tuesday, Thursday, and Saturday, between the hours of nine and ten, morning; two and three, afternoon; and six to eight in the evening.

The Library was originally kept in the Old Town, and William Hislop was librarian. When it was located there, Sir Walter Scott was a frequent visitor, and, in answer to some question regarding it, David Thomson thus replied,—

'We hae nae many books in vogue,
As you'll see by the catalogue,
In truth our funds are rather spare,
At present we can do nae mair,
We're ruined quite in oor finances
Wi' your bewitching, famed romances.'

On the removal of the Hislops to Bridge Street, William Gill was appointed librarian, and the books were transferred to Overhaugh Street. It would have been interesting to learn the nature and class of books in demand in the early years of the Library. The first mentioned list of new books occurs in 1827, when the following works were acquired:— The London Mechanics’ Magazine, the third volume of Byron’s Works, Tales of a Grandfather, Chalmers’ Pictures of Scotland, Travels and Voyages of Columbus, Nicholson’s Mechanic, and Gill’s History of Greece. What remuneration the librarian received is not stated till 1837, when it was fixed at £5, 10s per annum. In 1840 Mr Gill resigned on account of the members calling upon him for books at any time that suited their own convenience. Finding no one willing to undertake the duty, the committee requested Mr Gill to continue in office, which he did on condition that his salary was raised to £7. . . .

In 1843 there were 150 members, and in 1847 the salary of the librarian was increased to £10, 13s. In 1847 Mr Gill again resigned, and Edward Gray, painter, High Street, fulfilled the duties for the sum of £8 annually. Owing to a decrease in membership, the committee decided to admit readers on payment of one shilling and sixpence quarterly. The magazines read in 1850 were The Dublin University Magazine, Blackwood’s, Tait’s, Edinburgh Review, and Hogg’s Instructor. There were at this time one hundred and two members and thirty-nine readers. Notwithstanding the large additions made to the number of books, the membership declined, till in 1854 it had fallen to eighty-four and sixteen readers. With the view of attracting members, the entry money was reduced to two shillings and sixpence, but this proved of no avail. In 1859 the committee resolved to wind up the Library affairs and divide the books, all members in arrears being debarred from participating in the division. The number of books amounted to 3000, which were put up in lots corresponding to the number of members. A ballot took place, and the Library was dissolved, a considerable number of the volumes finding their way into the Mechanics’ Library.”

The Mechanics’ Library had been established in 1837, and kept going till 1873, when it gave up in face of the completion of the free Public Library which had been established under the Public Libraries Act of 1850, but wasn’t opened till 1874. The building below dates from 1889, I think. On the map below it’s situated just to the left of the Corn mill at the bottom.

Galashiels Public Library

Galashiels was in the eighteenth century a little village on the edge of the flood plain of Gala Water a couple of miles from the Tweed, squashed in between the river and the Laird of Gala’s policies (estate). I think it’s quite impressive how serious the villagers obviously were. Here’s part of a map of the town in 1824 which shows only a couple of the big woolen mills which came to dominate the town by the end of the century.

Detail of John Wood’s 1824 map. From National Library of Scotland

Montaigne’s library was located in this circular tower at the Château de Montaigne, near Bergerac and Saint-Emilion. You can take a virtual tour of a reconstruction of the library — in the sense of a room — at the Musée d’Aquitaine’s website. Click on it and you can drag the cursor around to rotate the display.

Moving from the sense of library as a room, to its meaning as a collection of books, Cambridge University Library’s Montaigne’s library contains Gilbert de Botton’s attempt at assembling all the books Montaigne had in that room. Their collection includes ten copies which were actually owned by Montaigne, and links take you to digitized versions of a few of them.

I guess a computer is really more complicated than a Linotype machine, but it’s all inside that black box. It’s like magic: you don’t see any action. With the heavy metal machinery we used to use to make things, you’d see (and hear, and smell, and feel) lots of stuff happening. It is just amazing to get great chunks of metal, working at high temperatures and relatively high speeds, click-clacking and bang-thumping away, and despite their heavy steel structure, creating precise little things regularly, repeatedly, and reliably. Designing a machine with almost ten thousand moving parts is amazing enough: getting it to work every morning at the hands of a wage-earner, and getting it to keep on working for years and years requires dedication and love. No wonder the compositors were the aristocracy of labor.

The Museum of Printing has a series of ten videos about the Linotype machine. These are for the enthusiast — fourteen minutes on lubricating an obsolete machine is not everyone’s choice of pastime — but if you need to know, here is Linotype Legacy. If you just watch one, try the first which shows you what you’d have to do every morning to get the machine ready to go.

See also Linotype.

There was never a flood, but there was always a constant drip, drip, drip as people moved from one side of the production desk to the other. The attraction of the money potential in a sales career would always call to some.* In fact you probably need to know more about book manufacturing to buy it well than to sell it!

The lure wasn’t altogether the money — there was the thrill. Buying print is safe and a little boring: selling it is almost 100% excitement — well, apart from the large part which is frustration, annoyance and disappointment. The thing about being a print sales representative is that there’s nothing to do if you don’t make it happen yourself. The thing about being a publishing production buyer is that there’s never any shortage of things to do: quite the opposite, employers being what they are there’s always far too much for one person to do in an eight-hour day. To quote an earlier post of mine about the job of a production buyer: “I used to compare the job to being in an enclosed room with fifty balloons floating in the air: your job was to prevent any one of the balloons hitting the floor”. Selling is more like blowing up balloons which all too often leak. It’s like flying a plane which you’ve never been in before and talking your way through it, while buying is more like taking a ride on a crowded bus. The sales rep starts every week with nothing on their desk — you get started calling, trying to make appointments so you can see someone and maybe make a start blowing up another balloon. This work pattern can actually be quite exhilarating. As a production manager, you’d get to your desk on a Monday morning already snowed under. Dealing with an overloaded desk does have its own rewards in exhilaration (satisfaction might be a better word) of a more modest kind.

It goes without saying that there are personalities which prefer one kind of work to the other. My own excursion onto the sales side was relatively brief, but I’m glad I did it.


* You can’t really generalize, as situations varied about as much as you could imagine, but insofar as sales representatives were rewarded by commissions, they stood to earn well when they landed a bestseller or managed to discover “a dripping roast” — a customer who keeps on ordering year after year. But you’d have to keep the heat on under the roast to ensure that the juices kept flowing. And of course you’d have to plan for dry spells. In the office, as a buyer, you’d be getting your regular salary, and could no doubt anticipate a pension in the fullness of time. Probably a smaller paycheck, but regular, safe, and something you could almost forget about. Horses for courses.

Plagiarism isn’t right, of course, but it always seems to me that its seriousness depends on who’s doing it and why. A politician is found to have plagiarized when writing their master’s thesis? So what? They’ve no doubt done a lot worse. An undergraduate plagiarizes in their weekly essay? Give me a break: what else are they meant to do? Well of course mentioning a source might be the responsible thing to do, but we can’t expect him/her to refer exclusively to their own original research can we? A bestselling author is shown to have cribbed most of their latest from someone else’s unknown novel — well, now you’re talking. (Part of the mystery surrounding the recent manuscript theft case is that this is exactly what it looks like, but there’s no evidence that the alleged perpetrator did anything of the kind.)

Of course influence can look like plagiarism and even Shakespeare’s been tarred with this brush. If an Andrew Lloyd Webber tune sounds a bit like Puccini, that’s one thing. If a Puccini piece sounds like Puccini that’s a horse of a different color. It’s a bit like an accent: hard to avoid. If you mistake someone on the street for a friend of yours, you can hardly accuse the stranger of trying to pass themselves off as your buddy.

Still, although things shade off into these grey areas, we are usually talking about more obvious behavior when we discuss plagiarism. Given that academic jobs are distributed on the basis of publication activity, it’s not altogether amazing that, in order to get tenure, one or two over-eager professors are willing to pass off the work of others as their own. This can be as simple as just copying a journal article, and changing the author’s name, though usually a little more effort is put into the task. The OUP Blog brings us an examination of six types of plagiarism.

The six headings under which they examine plagiarism are:

  • Paraphrasing (without mentioning the source)
  • Patchwork or mosaic copying
  • Verbatim quotation without acknowledgement
  • Source-based plagiarism (faking a good looking source)
  • Global plagiarism (passing off a copied work as yours)
  • Self-plagiarism

In one way one might think that self-plagiarism is fair enough — if I can’t use my own works whose work can I trust? But of course the issue is providing a reference. By failing to mention that the authority you are referring to is actually yourself, you attempt to imply that other researchers think as you do. And of course any extensive undercover quotation from yourself is a sign of laziness.

During the Covid epidemic, they inform us, “globally, the similarity score for academic submissions rose from an average of 35.1% to an average of 49.6% across the two measured time periods. This includes a 31% rise in paraphrased content and a 39% rise in identically matched content.”

The post is written by Epigeum, Oxford’s online course provider. They tell us that they “offer a number of programmes on these subjects [avoiding plagiarism and other poor practices] that provide comprehensive training and can form part of a wider approach to research or academic integrity”.

Bit depressing that we live in a world where integrity has become something adults need to be trained in. Plagiarism is obviously surprisingly hard to avoid. Just keep your references up to date, OK?

For a particular instance, see Textbook withdrawn. Another earlier post, Plagiarism suit, looks at a self-publishing outbreak. Plagiarism takes a look at on-line plagiarism checkers as part of a discussion of another Oxford University Press manifestation.