"Marketing Pornography on the Information Superhighway"
July 2, 1995 (version 1.00)
Donna L. Hoffman & Thomas P. Novak
Associate Professors of Management
Co-Directors, Project 2000
Owen Graduate School of Management
Vanderbilt University
In this critique, we provide a detailed analysis of the recent article "Marketing Pornography on the Information Superhighway..." (Rimm 1995, Georgetown Law Journal, Volume 83, June, pp. 1849-1934) that was also the subject of a recent Time cover story (Elmer-DeWitt, July 3, 1995). For a detailed critique of the Time article, see Hoffman & Novak (July 1, 1995, version 1.01).

First, we offer general comments about the study. We criticize the study on conceptual, logical, and process grounds, including: 1) misrepresentation, 2) manipulation, 3) lack of objectivity, and 4) methodological flaws. Second, we provide a series of detailed examples that support our general conclusions about these four major difficulties. For ease of exposition, these specific comments follow the order of the article.

Our objective with this note is to begin a constructive and open critique process. Thus, our note is not meant to be an exhaustive cataloging of the lapses, discrepancies, inconsistencies, and errors in this article, only a summary of those we consider to be among the most severe.

We do not debate the existence of pornography in "cyberspace." Indeed, pornography exists and is transmitted through many media, including cable television, books and magazines, video tapes, private "adult" bulletin boards, the postal mail, computer networks, interactive multimedia like CD-ROM, fax, and telephone, to name a few. What we dispute are the findings presented in this study concerning its extent and consumption on what Rimm calls the "information superhighway."

The critically important national debate over First Amendment rights and restrictions on the Internet and other emerging media requires facts and informed opinion, not hysteria. The critique is also important because the Time cover story has given the Rimm study a credibility it does not deserve.
If it is difficult for a professional journalist to evaluate the validity of such research, it is reasonable to assume that many others will have difficulties, as well.

The general and specific comments that follow represent our professional opinion of the critical flaws and errors in the Rimm study. Our critique has benefitted from the impassioned discussions in the WELL Media conference (topic 1029) of this study and of the larger debates over media responsibility and First Amendment rights.

-----------------------------------------------------------------

General comments

* Misrepresentation

  * The study is positioned as "marketing pornography on the information superhighway." Yet, it deals neither with marketing nor the information superhighway, and displays a considerable lack of understanding of both areas. Out of 148 footnotes in Rimm's manuscript, only one (footnote 22) cites a reference to a marketing journal. For a study purporting to deal with "marketing pornography on the information superhighway," this demonstrates a blatant disregard and ignorance of the marketing literature. As "marketing" appears as the first word in the title of this manuscript, and the word "marketing" appears frequently throughout the manuscript, it is particularly disturbing that Rimm does not support any of his "marketing" insights with references to the marketing literature.

  * This is a study of descriptions of pornographic images on selected adult BBSs in the United States. The author finds, not surprisingly, that adult BBSs contain pornography. While the author attempts to generalize beyond this domain to the "Information Superhighway," no generalization is possible, and the results of this study should not be used for this purpose. Unfortunately, the juxtaposition of unrelated analyses of adult BBSs and Usenet newsgroups may create in the casual reader's mind the impression that what is stated about adult BBSs is also true of the "Information Highway" as a whole.
We caution, in the strongest possible terms, against such misinterpretation on the part of the casual reader.

    Rimm concludes his study by saying (p. 1915) "These and other findings may assist policymakers and others concerned with the future of Cyberspace to make informed decisions, with reliable data, about the evolving Information Superhighway." Unfortunately, this paper provides no actionable insights for policymakers about the future of Cyberspace, as the results can, at most, be generalized only to adult BBSs in the United States.

  * The study is positioned as the product of a "research team" at Carnegie Mellon University. It is described throughout as the "Carnegie Mellon study" and it is frequently mentioned that the "research team" estimated this percentage or counted that set of items. Yet, nowhere is it mentioned that Rimm was an undergraduate electrical engineering student at CMU at the time the study was performed. Instead, Rimm is listed as a "Researcher" and "Principal Investigator." (Note that the four funding sources are identified only as coming from Carnegie, but the type and kind of grants are not revealed.) This positioning capitalizes upon the reputation of Carnegie Mellon University, lends an air of authority and credibility to the paper, and increases Rimm's own authority by association.

    The article is sole-authored. Not a single member of the extensive research team shared in the credit for the authorship of this paper. Given established standards of authorship as ownership of intellectual property in the academic and scientific community, we can only infer from this that no one on the "research team" felt their contributions merited shared authorship.

* Manipulation

  * The study was not subjected to peer review. The manuscript was deliberately "embargoed" for at least six months prior to publication, and was not made available to interested researchers. This is highly unusual.
The paper was submitted to a law journal which is not peer-reviewed, despite the fact that it probably would be more appropriate in a behavioral science or public policy journal (most of which are peer-reviewed). Since law journals have no one on the board to evaluate the merits of the methodology, and likely not even the distinctions among BBSs, Usenet news groups, the Web, and the Internet, we raise the following question: did Rimm place his article where it would appear credible and go unchallenged?

    At some point, an agreement was negotiated in which Time magazine obtained an advance copy of the manuscript in exchange for an "exclusive." This was used in preparation of the July 3, 1995 Time cover story written by Philip Elmer-DeWitt. Given the vast array of conceptual, logical, and methodological flaws in this study, documented thoroughly below, Time magazine behaved irresponsibly in accepting the statements made by Rimm in his manuscript at face value. Time had a responsibility to its readers to do its own peer reviewing, despite the embargo. Indeed, Time reporters were made aware that the study appeared to have serious conceptual, logical, and methodological flaws that Time needed to investigate prior to reporting its story.

    If Time was not able to evaluate the manuscript on its own, Time should have held the story until the manuscript was publicly available, so that expert opinion could have been solicited, or sought its own panel of objective experts for a "private" peer review. In this way, Time would likely have recognized the study for what it was and not what it purported to be, and prepared a balanced, critical report on the subject of digital pornography. Instead, Time presented, around lurid and sensationalistic art, an uncritical and unquestioning report on "cyberporn" based on Rimm's flawed study.
This has had the extremely unfortunate effect of giving the study an instant credibility that is neither warranted nor deserved, and of fueling the growing movement toward First Amendment restrictions and censorship.

  * The study appears to be driven by an underlying political agenda. It is difficult to read the paper in its entirety and not come away with the conclusion that it is written in a manner which provides policymakers with the ammunition they need to obtain support for legislation that would censor certain types of information on the Internet and other emerging media.

* Lack of Objectivity

  * Rimm makes numerous unsubstantiated causal statements. These causal statements are not supported by the data. In many cases, the causal statements are inflammatory and outrageous. Sometimes they are ridiculous. Additionally, data are often interpreted in a biased and selective manner.

* Methodological Flaws

  * The article is rife with methodological flaws, several of them extremely serious. The origins of many numbers presented in the article are difficult, if not impossible, to determine. Much greater attention is paid to sensationalistic and inflammatory descriptions of image files, for example, than to accurate descriptions of survey methodology. In fact, in many cases important aspects of the methodology are simply not described at all. Methodological details are either omitted entirely or presented in such sparse detail that it is impossible for other researchers to 1) determine what Rimm actually did and 2) replicate the results.

  * The study contains numerous discrepancies that cannot be resolved and raises a series of fundamental procedural, analytic, and implementation questions that can only be addressed outside of the article itself.

  * Operational definitions of "pornography" are ad hoc, inconsistent, and misleading.

  * Much of the data presented is consistently misinterpreted, particularly the Usenet data.
  * The paper describes the results in a confusing manner which makes it very difficult to determine what Rimm actually did. The manuscript is far too long and rambles. It is organized in such a manner as to obscure the methodological issues. This makes it difficult for the casual reader to draw his or her own conclusions about the merits of particular results from the study. For example, discussions of Usenet readership at a single university are interwoven with worldwide Usenet readership statistics. This is confusing and makes it easier to misinterpret his results, thinking that he might be talking about Usenet in general when in fact he is only talking about readership at a single university. Definitions of online media are similarly presented in such a way that the reader is likely to draw the conclusion that BBSs, Usenet news groups, the World Wide Web, and the Information Superhighway are all one and the same, and that what applies in one domain is relevant to all.

  * Rimm makes numerous unsubstantiated leaps of faith in his logical arguments.

  * The research methodology is not up to the rigorous standards of a peer-reviewed journal.

  * The study procedure raises a number of troubling ethical questions.

------------------------------------------------------------------

Specific Comments

Title and acknowledgement (p. 1849)

* The article's title states it concerns the marketing of pornography on the so-called "information superhighway," yet it appears in a law journal that is, by custom, not rigorously peer-reviewed. The acknowledgement indicates that organizations and experts in pornography were consulted (but not listed), but no organizations or experts conversant in marketing research, survey methodology, and marketing on the Internet and related online markets appear to have been consulted.

I. Overview (pp. 1849-1864)

* Rimm: "The ... study adopts the "definition" utilized in current everyday practice by computer pornographers.
Accordingly, "pornography" is defined here to include the depiction of actual sexual contact [hereinafter "hard-core"] and depiction of mere nudity or lascivious exhibition [hereinafter "soft-core"]...Accordingly, data was (sic) collected for this article only from bulletin board systems (BBS) which clearly marketed their image portfolios as "adult" rather than "artistic." Any BBS or World Wide Web site which made even a modest attempt to promote itself as "artistic" or "informational" was excluded." (fn. 1)

Rimm's definition of pornography is central to his study. It is therefore reasonable to expect a detailed analysis of what pornography is, along with arguments for how it may be defined and measured. Such discussion would include the advantages and disadvantages of each measurement approach and lead to a reasoned position on the operational definition employed in the study. Because the results of his study depend on his definition and measurement scheme, it is surprising that the definition he proposes is so weakly supported, and fluid besides. For example, in the analysis to follow, Usenet newsgroups appear to be classified as "pornographic" if they contain the word "sex" in the title (except for alt.safe.sex), or if he judges them to be so.

Further, the footnote is misleading, because it implies that Rimm studied the Web with the same energy that he applied to adult BBSs, when in fact he only searched the Web in order to locate and provide a simple count of sites judged to be "sex-related" (Appendix C, p. 1923 ff).

* Rimm: "It is essential to note that Usenet and the World Wide Web are merely different protocols." (p. 1869)

This statement is erroneous and suggests a disturbing misunderstanding of the nature of online media, particularly as they relate to consumers and providers.
For definitions and discussion, see Hoffman and Novak's paper on Marketing in Computer-Mediated Environments (http://www2000.ogsm.vanderbilt.edu/cme.conceptual.foundations.html).

* Rimm: "...this article discusses only the content and consumption patterns of sexual imagery currently available on the Internet and "adult" BBS..." (fn. 2)

This statement is misleading because, in fact, the article discusses the content analysis of descriptive listings of images obtained from adult BBSs and the readership data from selected Usenet newsgroups. Usenet readership data can only tell that a Usenet group was accessed, but does not tell if any text files were read or any images were downloaded.

* Rimm: "Every time consumers log on, their transactions assist pornographers in compiling databases of information about their buying habits and sexual tastes. The more sophisticated computer pornographers are using these databases to develop mathematical models to determine which images they should try to market aggressively." (pp. 1850-51)

Every time consumers log on to what? In the final analysis the article provides very little evidence, other than anecdotal or case study, to support the idea that pornographers are engaging in such activities.

* Rimm: "Computer pornographers are also moving from a market saturation policy to a market segmentation, or even individualized, marketing phase." (p. 1851)

The statement is misleading because it implies that pornographers have a strategic policy which is now shifting. However, the article supplies no evidence of the original policy, let alone the shift to a market segmentation strategy.

* Rimm: "It is clear that pornography is being vigorously marketed in increasingly sophisticated ways and has now found a receptive audience in a wide variety of computer environments." (p. 1852)

The article supplies no evidence that pornography is being "vigorously marketed," nor does it define marketing.
The study does not investigate audience receptivity in a "wide variety of computer environments." Instead, it studies download records from selected adult BBSs in the United States and Usenet postings (but not Usenet downloads). Thus, this conclusion cannot be supported from the research presented in this paper.

* Rimm: "'Information Superhighway' and 'Cyberspace' are used to refer to any of the following: Internet, Usenet, World Wide Web, BBS, other multimedia telephone, computer, and cable networks." (fn. 7)

These two definitions are misleading and do not conform to the meanings of the terms commonly understood by researchers and experts in the field.

* Rimm: In the top paragraph on page 1853, Rimm argues that his study is the first to systematically examine "pornography on the Information Superhighway," and that it is now possible to obtain "vast amounts of information about the distribution and consumption" of pornography on a much larger scale than previously possible.

Are there any previous studies of pornography on the "Information Superhighway," even if unsystematic? In what way does this study concern the "Information Superhighway?" A framework should be developed for adult BBSs - the focus of this study - in the context of the "information superhighway." For example, what percent of traffic do adult BBSs represent of the total "highway?" What percent of users of the "highway" use adult BBSs? What do the distributions look like nationally and internationally? And so on.

* Rimm: "...it may be difficult for researchers to repeat this study, as much valuable data is no longer publicly available." (fn. 9)

This is an astonishing and intellectually suspect statement, almost transparent in its effort to set up a case that this study cannot be falsified. If subsequent research shows disagreement with the results of this study, Rimm can discount such results by saying that it could not be repeated anyway.
Instead, good scientific practice demands that Rimm work to show how the study can be replicated by subsequent researchers. However, as the analysis below argues, even the analyses here cannot be replicated, because Rimm provides no details of methodology which would enable that to happen.

* Rimm: The first full paragraph on page 1853 discusses the 917,410 "pornographic" items downloaded 8.5 million times that form the bulk of the study.

Subsequent sections of the paper show that this paragraph is misleading in the extreme, as is the article title. The title of the article suggests the research will concern a "survey of 917,410 images, descriptions, short stories and animations downloaded 8.5 million times." (Note that Rimm does not perform "survey research" in this study, as no one is surveyed.) On page 1853, the 917,410 items are broken down as:

  * 450,620 items downloaded 6.4 million times from 68 adult BBSs
  * 75,000 items with an unspecified number of downloads from 6 adult BBSs
  * 391,790 items with no download information from 7 adult BBSs

These items include images, animations, and text files. Rimm says that 10,000 "actual images" were "randomly downloaded" from adult BBSs, the Usenet or CD-ROM and used to verify the accuracy of the descriptive listings. Rimm does not, however, 1) report the methodology used to randomly select the images, 2) provide frequency distributions of the images across the media they were obtained from, 3) specify the exact media used to obtain the listings (e.g. which CD-ROMs?), nor 4) indicate how the accuracy verification procedure was performed.

In footnote 10 on page 1853, Rimm says the original number of downloads was counted at 6.4 million and that "a total of 5.5 million downloads are analyzed here." (emphasis ours). He explains that the other "0.9 million concern animations, text, and other miscellaneous files" which he presumably excluded from analysis.
He continues that "an additional 2.1 million downloads was later obtained from...Amateur Action BBS. In this way, the total number of downloads tabulated is 8.5 million."

We note that this tabulation of 8.5 million downloads is misleading for two reasons: 1) Rimm did not specify the period of time in which the 8.5 million downloads accumulated. Was it one month? One year? Five years? Ten years? 2) Rimm did not actually analyze 8.5 million unique downloads, as at least some were apparently excluded from analysis.

While 8.5 million exposures to pornographic images may sound like a large number, let us put it into perspective. Suppose a pornographic newsstand magazine had a circulation of 500,000, including subscriptions, newsstand sales and pass-along readership. If there were 10 pornographic photographs in a single issue of this magazine, there would be 5 million "exposures" in this single issue alone. Thus, 8.5 million must be set in a context which specifies the time period, and the equivalent exposures in "competing" media during this time period.

* Rimm: "A total of 292,114 image descriptions remained and are discussed here. At least 36% of the images studied were identified as having been distributed by two or more "adult" BBS." (p. 1854)

Apparently, Rimm analyzed 292,114 descriptive listings of images only, presumably representing 5.5 million downloads. No indication is given of how duplicates were identified as such, nor of how they were distributed across the listings, either by individual adult BBS or by, for example, geographic region. In footnote 11 on page 1854, Rimm suggests that whatever method was used to identify duplicates had its validity confirmed by randomly sampling 100 "suspected duplicates," and presumably examining them. Yet, he does not indicate how he "suspected" them in the first place, how they were sampled, or how the validity was "confirmed," as no details or statistics are provided to support the statements.
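The tallies quoted above from page 1853 and footnote 10 can be checked with a few lines of arithmetic. The sketch below is only an illustration: the variable names are ours, the figures are those quoted in this critique, and the magazine is the hypothetical one described above, not a measured circulation.

```python
# Figures as quoted in this critique from Rimm (1995), p. 1853 and fn. 10.
# All names below are illustrative, not taken from the article.
items_by_source = {
    "68 adult BBSs (download counts available)": 450_620,
    "6 adult BBSs (downloads unspecified)": 75_000,
    "7 adult BBSs (no download information)": 391_790,
}
total_items = sum(items_by_source.values())

downloads_counted = 6_400_000   # original count reported in fn. 10
downloads_excluded = 900_000    # animations, text, miscellaneous files
downloads_amateur = 2_100_000   # added later from Amateur Action BBS

downloads_analyzed = downloads_counted - downloads_excluded
downloads_tabulated = downloads_counted + downloads_amateur

# Hypothetical magazine from the text: 500,000 readers, 10 photographs.
magazine_exposures = 500_000 * 10

print(f"items in title:        {total_items:,}")
print(f"downloads analyzed:    {downloads_analyzed:,}")
print(f"downloads tabulated:   {downloads_tabulated:,}")
print(f"one magazine issue:    {magazine_exposures:,}")
print(f"tabulated / one issue: {downloads_tabulated / magazine_exposures:.1f}")
```

The headline figure of 8.5 million is reached only by adding the 2.1 million Amateur Action downloads to the full 6.4 million, that is, without removing the 0.9 million downloads that footnote 10 says concern non-image files. Under the hypothetical magazine assumption, the same 8.5 million corresponds to fewer than two issues' worth of exposures.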
* Rimm: Part II of this article addresses three issues concerning pornography on the Usenet: (1) the percentage of all images available on Usenet that are pornographic; (2) the popularity of pornographic boards in comparison to non-pornographic boards, at both a university studied and worldwide; and (3) the origins of pornographic imagery on the Usenet. (p. 1854)

Our critique of Part II will show that due to serious methodological flaws, the study does not, in fact, provide accurate data on these issues.

* Rimm: All BBS data was (sic) collected in May and June 1994, unless otherwise noted. (p. 1855)

In footnote 15 on page 1856, Rimm states that the study "tracks image repertoires over a fourteen-year period." No clarification is provided here or subsequently to reconcile the discrepancy between the two-month data collection period and the 14 years, or to explain how the study follows "image repertoires" longitudinally.

* Rimm: "...this study focuses entirely upon what people actually consume, not what they say they consume; it thus provides a more accurate measure of actual consumption." (p. 1855)

Rimm analyzes aggregate download counts of descriptive listings of images available on adult BBSs. Although download patterns would be expected to correlate with actual consumption (i.e. viewing), we do not know the extent to which individuals actually looked at the images (or, indeed, whether they looked at all). These limitations are not addressed in the study and no thoughtful discussion of the consumption experience is ever provided. Further, absolutely no download behavior on Usenet news groups was ever examined by Rimm.

* Rimm: "Because the data is (sic) in many respects exhaustive, statistical techniques and assumptions that are commonly invoked to impute general consumer behavior are not necessary for this dataset. Thus the research team considers the inferences drawn highly robust." (p. 1856)

Are the data really exhaustive of adult BBS?
Rimm does not provide evidence that the listings obtained from the BBSs represent a census. Further, the statement that statistics are not necessary for these data is astonishing. No evidence, statistical or otherwise, is ever provided in the article that the inferences drawn from these data are, indeed, "robust."

* Rimm: "The ... study examine 917,410 images, image descriptions, short stories, and short films..." (fn. 15)

Yet earlier, on page 1854 and in footnote 10 on page 1853, Rimm suggests he deleted all but the images from the database under consideration and retained 292,114 for "discussion." Thus, how many items did the study actually examine?

* Rimm: "The study results suggest a tremendous rift between the sexual activities in which Americans claim to engage, as reported most recently by the study Sex in America, and the sexually explicit activities presented in images that many Americans consume." (p. 1857)

This statement is misleading, because Rimm did not study individuals, but aggregate download counts of descriptive listings of images available on adult BBSs. The Sex in America study surveyed the general population, and did not examine individuals' consumption behavior as measured in downloads on adult BBSs in the United States. In other words, the two studies examine two completely different populations. Thus, there is no basis for the conclusion that a "tremendous rift" exists, and the statement represents an "apples and oranges" comparison.

* Rimm: "Among the ultimate findings of this study are that digitized pornographic images are widely circulated in all areas of the country and that due to market forces, digitized pornographic images treat themes...which are not otherwise widely available." (p. 1857)

The conclusion is not supported by the data because Rimm examined only downloads of pornography on adult BBSs and readership statistics of selected Usenet newsgroups.
He did not examine the distribution or consumption of pornography, by category or otherwise, in other media, nor does he provide evidence from others' examination. Thus, there is no basis for the comparison.

* Rimm: "One of the more intriguing questions raised by this study is whether the general population will demand the same types of imagery currently in high demand among computer users." (p. 1857)

This statement is misleading. All computer users? Some computer users? How many are "demanding" it now? What types? Indeed, why would the general population be expected to exhibit the same types of preferences as subscribers to adult BBSs, which is the only group of "computer users" for which Rimm studied imagery?

* Rimm: "The widespread availability of pornography on computer networks may have a profound effect on those who wish to utilize the emerging National Information Infrastructure for non-pornographic purposes." (p. 1858)

This statement is blatantly biased. Rimm did not examine the extent of pornography on "computer networks" such as the Internet or online services, and provides neither discussion of these issues nor references to balanced discussion of them.

* Rimm: "While it may not be possible in the next decade for such technology to automatically classify images with the same precision as the Carnegie Mellon linguistic parsing software..." (fn. 21)

The precision of the noted software is never established, let alone described in any detail.

* Rimm: "More than two dozen faculty, staff, graduate and undergraduate students at Carnegie Mellon University contributed in some manner to this study." (p. 1861)

Yet in fact, the article is a sole-authored study, performed when the author was an undergraduate student in Electrical Engineering, that was not subjected to the usual rigors of peer review and revision that are common for this type of research. No one, other than Rimm, has accepted responsibility for the intellectual property in this study.
Further, the individuals listed on page 1849 represent an acknowledgement by Rimm, rather than an endorsement of the manuscript by all of them.

* Rimm: "After a year of exploring the Internet, Usenet, World Wide Web, and computer Bulletin Board Systems (BBS), the research team discovered that one of the largest (if not the largest) recreational applications of users of computer networks was the distribution and consumption of sexually explicit imagery." (p. 1861)

As we continue to note, Rimm's study concerns download patterns on selected adult BBSs and readership statistics on selected Usenet newsgroups. Rimm may have explored these systems, but provides no evidence for the conclusion stated above. Further, Rimm's statement is misleading, as it implies that the largest recreational application is not just in downloads (i.e. "consumption"), but also in uploads (i.e. "distribution"). Rimm's study does not examine uploads.

* Rimm: "An unusual amount of data was (sic) freely available from commercial "adult" BBS, primarily as a consequence of the evolution of the online industry. Large commercial BBS such as American Online, Compuserve, and Prodigy do not carry hard-core pornographic imagery, either for legal or policy reasons. As a consequence, several thousand comparatively small "adult" BBS have sprung up across the country." (p. 1861)

The statements are misleading because no evidence is provided to support the conclusion of a causal link between activities on commercial online services and adult BBSs.

* Rimm: "In many instances, the research team was able to persuade the owners of these BBS to provide information about subscriber consumption habits." (p. 1862)

This is a troubling statement. How was Rimm able to obtain such consent? Was it "informed consent?" Did Rimm provide full disclosure to these operators about the nature and objectives of his study? Did Rimm "debrief" them afterwards?
Did he get the permission of the subscribers of these BBSs to examine information about their consumption habits? Did Rimm submit a proposal of his methodology for such "persuasion" to the University Human Subjects Committee? Did they approve the research and the methodology?

II. Usenet (pp. 1865-1876)

* Rimm: "This article will first discuss the methodology and results of the study of Usenet images and will then explain the methodology and results of the study of BBS images." (p. 1865)

Footnote 28 (p. 1865) appended to this sentence refers the reader to footnotes 25-27 "for discussion of the distinction between the Usenet and commercial BBS." Such distinctions are critical for correct interpretation of Rimm's results and do not belong in footnotes. Nevertheless, examination of the footnotes reveals the following: footnote 25 "assumes that the reader has a basic understanding of Usenet and BBS" (p. 1862), and refers the reader to several books and a magazine; footnote 26 cites a 20-year-old FCC document on "MTS and Wats" (p. 1863); and footnote 27 cites a brochure "on file with the Georgetown Law Journal" (p. 1864).

* Pornographic vs nonpornographic imagery in the alt.binaries groups

Rimm: Rimm states that he examined "[a]ll of the Usenet newsgroups with the prefix 'alt.binaries'" from September 21-September 27, 1994 and goes on to say that "[t]he number of new images posted each day was tabulated for both pornographic and non-pornographic newsgroups." (p. 1865)

No rationale for excluding audio and text is provided other than that they were "not the subject of this study." Does it make sense to look at all types of pornography on the Usenet and compare that to all other types of information? Rimm does not indicate how he determined which alt.binaries groups were pornographic and which were not. In what manner did Rimm control for duplicate, re-sent, or non-pornographic images? Did Rimm count posts or a complete image?
(Note that a single image could have up to 10 more files to make it complete.) In effect, what was the unit of analysis: a post or an image? On Saturday, 7/1/95, a colleague counted the number of posts on alt.binaries.pictures.erotica and found 1650 posts. One image was 41 posts long and represented 2.5% of the message volume alone. The article is silent on these important methodological details.

* Popularity of Pornographic vs NonPornographic Usenet Newsgroups

Rimm: "The research team was also able to examine the online habits of 4227 users at a mid-sized, private university in the northeast." (p. 1865)

This raises troubling issues. How was Rimm able to conduct such examination? Did he obtain "informed consent" from each student? Did Rimm provide full disclosure to these students about the nature and objectives of his study? Did Rimm "debrief" them afterwards? Did the University Human Subjects Committee approve this examination? It is curious that Rimm argues in numerous places about the possible public policy implications of his work, but does not raise the ethical implications of conducting such research (only the implications of reporting it). See, for example, footnote 40 on page 1869, where he discusses his decision not to report "detailed demographics of the university population of computer pornography consumers" but makes no mention of whether it is appropriate to gather the data in the first place.

* Rimm: In footnote 30 on page 1865, Rimm argues that the 11% of computer users at the private university "block" site statisticians from monitoring in order to "avoid detection" of their online activities. After discussing a behavioral analysis of child molesters, he proposes that "it is possible that some Internet users who block their accounts prefer sexual images of children and wish to avoid detection."

This argument is one of the more outrageous in the paper and represents an invalid causal link.
In the first place, there is no evidence that the 11% who "block" their activities are child molesters, and in the second place, there is no evidence that the 11% are representative of the broader population of Internet users. Thus, there is no basis for the proposal that Internet users who do not wish their activities monitored prefer to look at "sexual images of children."

* Percent of pornographic imagery in Usenet binaries groups

Rimm: "Among the pornographic newsgroups, 4206 image posts were counted, or 83.5% of the total posts." (p. 1867)

The interpretation is incorrect and the number is grossly inflated. It is based upon 17 alt.binaries groups that Rimm considered "pornographic" and 15 alt.binaries groups that Rimm considered "non-pornographic." However, Rimm provides no listing of the names of these groups, no distributions of posts in these groups, and no methodological discussion of how he counted posts and determined whether they were pornographic, so there is no objective evidence that these groups are, in fact, "pornographic." Also, no information is provided on the degree to which these 32 groups comprise the complete universe of Usenet imagery. Further, as the methodology for counting the number of images is not specified, it is likely that even given Rimm's definitions and selection of 32 groups, the percentage is inflated due to the inclusion of non-pornographic text comments and multi-part images in the counts. What are the distributions of posts, by type of post (imagery, text, audio), in each of these newsgroups? What were the total numbers of posts to each group, to each set of groups, and to Usenet overall during the period? How did Rimm determine that the 4206 image posts to the 17 supposedly pornographic alt.binaries groups are, in fact, pornographic? A more accurate interpretation is that 83.5% of the images posted to 32 alt.binaries newsgroups came from 17 groups that Rimm determined were pornographic.
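
To illustrate why the unit of analysis matters, the following minimal sketch shows how a tally of posts can diverge from a tally of complete images when multi-part images and text comments share a newsgroup. The subject-line convention "name (part/total)", the 1-based part numbering, and the toy data are assumptions made purely for illustration; nothing here reproduces Rimm's procedure or his data.

```python
import re
from collections import defaultdict

# Assumed (hypothetical) subject convention: "big.gif (3/41)" is part 3 of 41.
PART_RE = re.compile(r"^(?P<name>.+?)\s*\((?P<part>\d+)/(?P<total>\d+)\)\s*$")


def summarize(subjects):
    """Tally posts, image-part posts, text posts, and complete images."""
    parts_seen = defaultdict(set)  # (name, total) -> part numbers seen so far
    text_posts = 0
    for subject in subjects:
        match = PART_RE.match(subject)
        if match is None:
            # No part marker: treat the post as a text comment.
            text_posts += 1
            continue
        key = (match.group("name"), int(match.group("total")))
        parts_seen[key].add(int(match.group("part")))
    # An image is complete only if every part 1..total was seen.
    complete_images = sum(
        1
        for (_, total), parts in parts_seen.items()
        if parts == set(range(1, total + 1))
    )
    return {
        "posts": len(subjects),
        "image_posts": len(subjects) - text_posts,
        "text_posts": text_posts,
        "complete_images": complete_images,
    }


if __name__ == "__main__":
    # Toy newsgroup: one 41-part image, three single-part images, six comments.
    toy = [f"big.gif ({i}/41)" for i in range(1, 42)]
    toy += ["a.gif (1/1)", "b.gif (1/1)", "c.gif (1/1)"]
    toy += ["Re: request"] * 6
    print(summarize(toy))
```

In the toy data, the image-part posts far outnumber the complete images, so a percentage computed over posts and one computed over images would tell very different stories. Which of the two Rimm computed cannot be determined from the article.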
To make matters worse, Rimm grossly overgeneralizes his results in footnote 36 (p. 1868) and his summary (p. 1914): "83.5% of all images posted on the Usenet are pornographic." This is a particularly misleading misinterpretation of his narrow result.

* Misleading interpretation of "popularity" of types of Usenet newsgroups.

Rimm: (p. 1849) "'Pornography' is defined here to include the depiction of actual sexual contact...and depiction of mere nudity or lascivious exhibition."

Rimm uses bold text to identify "newsgroups identified as having pornographic content" in Table 1 and Table 2. Included among pornographic newsgroups are "alt.sex" and "alt.binaries.pictures.supermodels." This is not consistent with Rimm's stated definition of pornography, as there is little of what would be considered pornographic content in these groups. It is a biased and inflammatory characterization of these Usenet groups.

The column headings in Table 1 are not explained. Is the user base the 4227 from page 1865 or some other number? This particular site receives only 3600 (p. 1870) of the 14,000 Usenet newsgroups (p. 1862), or only 25.7% of all groups. This seems like a small percentage of total groups. Is it? What do the percentages at other institutions look like? Without knowing this, it is difficult to generalize beyond this site to the entire Usenet domain. What would happen if we included data from the other 10,000+ sites?

It is truly astonishing that there are no comp.* or news.* groups among the Top 40 Usenet newsgroups at the university studied. Indeed, if the university is Carnegie Mellon, this is simply unbelievable. By this chart, only 99 readers are required to put a group at number 40. Additionally, the Top 40 newsgroups in Table 1 differ dramatically from the Top 40 overall, according to the arbitron statistics.
Rimm: In footnote 30 on page 1865, Rimm argues that "there is no reason to believe consumption at the university study differs from that of other universities from which pornographic Usenet newsgroups can be accessed."

But, in fact, there are reasons to believe otherwise. A study conducted at Vanderbilt University (Varki 1995) as part of the requirements for a doctoral seminar on "Marketing in Computer-Mediated Environments" showed that the top Usenet newsgroups in terms of number of postings differ markedly and in important ways from the Top 40 list presented in Table 2 (p. 1872). Since this is a worldwide listing, intuition alone would suggest the likely presence of regional differences, at the least. In any event, no evidence is presented to support his reasoning in footnote 30. All of these problems suggest that the university in question may actually be fairly atypical in its use of Usenet newsgroups, which limits the generalizability of its results.

Rimm: "In broad terms, the research indicated that pornographic newsgroups are accessed more frequently during the school year than during summer recess. This suggests that, in comparison to teachers, faculty, and staff, a disproportionately large number of students access Usenet pornography." (pp. 1869-1870)

In fact, the conclusion does not follow, since Rimm does not present evidence (e.g. counts, frequencies, and proportions) indicating how many students access pornography relative to the other groups. Nor does Rimm provide a version of Table 1 for the academic year from which readers might draw their own conclusions.

Rimm: "The fact that alt.sex.stories is currently more popular than alt.sex.pictures.binaries.erotica has been often misinterpreted as an indication that stories are more popular than images." (p. 1871)

In fact, Rimm presents no evidence that such "misinterpretation" exists, although we can assume the interpretation exists.
His alternative explanation is interesting, but no data are offered on how many users are discouraged by the level of technical sophistication required to access these groups. Indeed, a rival hypothesis is that these groups are accessed by a singularly technically sophisticated user, not the reverse.

* Percentage of sites containing "pornographic" Usenet newsgroups.

Rimm: (p. 1871) "The worldwide statistics suggest that Usenet hosts appear less willing to offer their readers access to pornographic newsgroups than other types of newsgroups. 81.2% of the sites offer access to non-pornographic newsgroups, whereas only 55.8% of the sites offered their readers access to the pornographic newsgroups."

In our opinion, Rimm has clearly misinterpreted the data. An examination of Tables 2 and 3 will immediately reveal that the important distinction is not between "pornographic" and "non-pornographic" groups, but between "alt" and "non-alt" hierarchies. All of Rimm's "pornographic" groups are from the "alt" hierarchy. No alt group in Tables 2 and 3 is carried by more than 66% of sites. While alt.binaries.pictures.erotica (one of Rimm's "pornographic" groups) is carried by 53% of sites, alt.binaries.pictures (a "non-pornographic" group) is carried by only 49%, and alt.binaries.sounds.tv is carried by only 34% of sites.

* Misleading portrayal of newsgroup readership.

Rimm: (p. 1873) "The newsgroups are ranked in Table 2 by the estimated total number of readers worldwide."

Rimm identifies Brian Reid's "arbitron" script as the source of the data in Table 2. However, Rimm does not provide Reid's caveat on exactly what "readership" really means. Reid (Usenet Readership Summary Report for May 95) is careful to note that: "To 'read' a newsgroup means to have been presented with the opportunity to look at least one message in it." ...
"Assuming that 'reading a group' is roughly the same as 'thumbing through a magazine', in that you don't necessarily have to read anything, but you have to browse through it and see what is there." This is a critical point. There is absolutely no information from Table 2 on how many of the 260,000 "readers" of alt.binaries.pictures.erotica actually downloaded and uudecoded a binary image file. The arbitron data is not tracking downloads. In fact, it would be completely consistent with Reid's definition of readership if none of the "readers" of alt.binaries.pictures.erotica ever saw a pornographic image. Thus, the results shown in Table 2 simply cannot be used to establish the exposure of "readers" to pornographic imagery. A reasonable hypothesis is that "readers" are simply curious about what is in these groups, and browse the titles to get some idea. As Rimm notes, decoding Usenet binaries requires a non-trivial degree of technical skill. We should further note that if one takes the estimate of individuals with Internet access as 20 million, then at most we are speaking of about .1% of Internet users accessing the alt.binaries.pictures.erotica newsgroup, and almost surely, the percentage actually downloading and uudecoding pornographic images is much lower than even this very low percentage. Rimm: In footnote 31 on page 1866, Rimm suggests that the (presumably total) number of readers of alt.binaries.pictures.erotica on Usenet is 260,000 per month. Rimm provides no discussion of the methodological details necessary to understand this estimate. How is this number estimated? How are multi-part image files counted? How are robot extractions handled? Are these 260,000 people unique? Or, could they possibly represent, for example, the same 9000 individuals per day for 30 days? How does the "arbitron" script keep track of individual users? In other words, are reach and frequency confounded? Does Rimm know? * Amount of pornography in Usenet groups. 
Rimm: "Of this 11.5%, approximately 3% [of messages on the Usenet] is associated with Usenet newsgroups containing pornographic imagery." (p. 1869)

Rimm fails to take these traffic percentages to their logical conclusion, which is that less than 1/2 of 1% (3% of 11.5%) of the messages on the Internet are associated with newsgroups that contain pornographic imagery. Further, of this half percent, an unknown but even smaller percentage of messages in newsgroups that are "associated with pornographic imagery" actually contain pornographic material. Much of the material that is in these newsgroups is simply text files containing comments by Usenet readers.

Rimm: (p. 1873) "Moreover, 20,644 of the 101,211 monthly Usenet posts in the top forty newsgroups, or 20.4%, are pornographic."

This figure is inflated and incorrect. Rimm is assuming that 100% of the content of the so-called "pornographic" newsgroups in Table 2 is pornography. But this is obviously incorrect. A large number, if not the majority, of messages in these groups are simply text representing discussion and comments - not pornographic. In addition, large images are typically broken into multiple parts, so that one large .gif file might actually consist of ten or more physical files. Further, even single file images often have a separate descriptive header (which should be considered non-pornographic). While it is impossible to determine from the results Rimm has presented what proportion of monthly Usenet posts are "pornographic," we can safely conclude that the percentage is far below what Rimm states.

* Origins of pornographic imagery on the Usenet.

Rimm: (p. 1874) "71%, or 1671 of the 2354 pornographic images downloaded from the five Usenet newsgroups studied over a four month period, originated from "adult" BBS."

This is a critical percentage, yet we question its validity. Virtually no support is given for this percentage other than Rimm's in-text statement that 1671 images originated from adult BBSs.
We cannot determine how Rimm arrived at this number from our reading of the manuscript. Is it an estimate? A count? How was it estimated or counted?

Rimm lists the five Usenet newsgroups on which he says "[t]he largest selection of sexual imagery was discovered" at the northeastern university (p. 1866) and notes in footnote 32 (p. 1867) that these sites were the largest available at the "research site." He further says that between "April and July of 1994,...all available images (3254) [were downloaded] from these five newsgroups." (p. 1867) There must be a typographical error, because earlier Rimm stated that the alt.binaries groups were not examined until September of that year, so it cannot be possible that months earlier he was able to determine the groups with the largest selection of sexual imagery. The appearance of alt.sex.fetish.watersports is also confusing since it is not an alt.binaries group. It is possible that it all makes sense, but it is very difficult to sort out from the confusing exposition.

We also wonder if group size and availability are confounded with amount of imagery. The main issue is that convincing evidence has not been presented that these five groups contain the largest selection of sexual imagery. Where did this list come from? These groups are at a single university site. Was a systematic analysis of all Usenet groups performed to generate this list? Did Rimm control for duplicate, re-sent, or non-pornographic images? Did Rimm count posts or complete images? What was the unit of analysis: a post or an image?

Rimm states that the images from the five Usenet groups were classified into three categories (p. 1867): 1) images originating from adult BBSs ("the name, logo, and telephone number of the BBS appeared next to or within the image"); 2) pornographic images which did not originate from BBSs; 3) "PG/R" images ("no sexual contact or lascivious exhibition"). Curiously, there is no category for images that are not pornographic!
Was every single image on these groups pornographic? Rimm does not indicate whether these categories are mutually exclusive; for example, how were "PG/R" images with a BBS logo counted?

In any event, Rimm states (on page 1867) that there were "a total of 2830 images for analysis," but does not report the frequency of images in each of the three categories. (He states that 13% of the images could not be downloaded, which makes us wonder whether other figures presented need to be similarly adjusted to account for technical difficulties which must ultimately lower consumption rates.) However, seven pages later, the total number of pornographic images downloaded from the five groups shrinks to 2354 images, with no explanation! If we accept the 1671 as indicative of the number of images in those five groups that Rimm determined came from adult BBSs, then the percent of images originating from adult BBSs is 59% (1671/2830) if we use his first number and 71% (1671/2354) if we use his second.

Rimm: (pp. 1874-1875) "For those who consider pornography to include the additional 476 'PG' or 'R' rated images defined in the methodology section, 59% of all Usenet images originate from 'adult' BBS."

Since adding counts to the numerator of a fraction must increase the resulting percentage, 71% followed by 59% cannot be correct. Of course, perhaps this is a typographic error or an error of confusion. Was the 59% meant for the first percentage reported using the first (and larger) denominator? In that case, is the 71% also a typographical error or error of confusion? In any event, this additional calculation makes no sense unless all images in the third "PG/R" category were exclusively from adult BBSs. Thus, the percentage of 59% (which should be higher than 71% but must be a typo) is misleading until clarification can be provided! Note that the percentages are suspect (Which denominator is "right?" The larger? The smaller? Neither?
Are 1671 and 476 even the correct quantities?), not only because of the confusing manner in which the information is presented, but also because of the more serious methodological criticisms made earlier about the selection of the five newsgroups on which these numbers are apparently based.

III. Pornographic "Adult" Commercial BBS (pp. 1876-1905)

* Number of adult BBSs examined

Rimm: "...[T]he team either subscribed to, or logged on as a new user or guest, to a number of representative pornographic BBS and collected descriptive lists of the files offered by each." (p. 1876)

Rimm reports that Boardwatch estimates that 5% of BBSs in the country are "adult," (fn. 35, p. 1867) but does not report a figure on the total number of BBSs, only that 5000 BBSs of any type were identified (p. 1877) and that 500 "active" adult boards were located for further study. Since this represents 10% of his list, we can assume that Rimm's list of BBSs was not complete. Rimm indicates that "most" of these 500 adult boards were "chat" boards, and still others were "transient." He gives no figures on how many comprised each category.

Rimm: "To the best of the research team's knowledge, the BBS included in this study comprise most of the medium- and large-sized "adult" BBS in the country that existed at the time of the research." (p. 1877)

Rimm does not indicate how many boards this represents, how they were sampled to be "representative" (p. 1876), whether the list of adult BBSs Rimm sampled from was exhaustive, or whether Rimm used his "judgment" in selecting BBSs or in generating the list of BBSs to sample from.

* Number of descriptive lists examined

Rimm: "This portion of the study analyzes a total of 450,620 files that are classified..." (p. 1876)

Previously, Rimm indicated that 292,114 descriptive listings were retained for analysis. How many listings were actually collected? How many pornographic images do these listings represent? How many were movies? How many were text files?
How many images were selected from the BBSs and how were they selected? Rimm indicates that both "descriptive lists" of pornographic images as well as a "representative sampling of the images themselves" were collected from the BBSs (p. 1877). Rimm does not say how many images were sampled, how they were sampled to be "representative," or what they were supposed to be representative of.

* Ethical lapse?

Rimm: "Members of the research team did not, as a rule, identify themselves as researchers." (p. 1878)

As before, this is troubling. Why didn't Rimm identify himself and his research objectives to the operators? Did Rimm obtain permission to "collect" the information from the BBSs? Did Rimm provide full disclosure to these operators about the nature and objectives of his study? Did Rimm "debrief" them afterwards? Did he get the permission of the subscribers of these BBSs to examine information about their consumption habits and report the cities they lived in (see Appendix D: pages 1926-1934)? Did Rimm submit a proposal of his methodology for such research to the University Human Subjects Committee? Did they approve the research and the methodology? Does Carnegie Mellon approve of publishing the cities that consumers of adult BBSs live in? How did Rimm obtain the demographic information on adult BBS subscribers (as noted on p. 1895)?

* Results of the linguistic classification scheme

Despite twelve pages of largely anecdotal discussion of the content analysis of the descriptive listings, the methodology is never once described formally, either in terms of the algorithm or the software used to implement the algorithm. In the scholarly literature it is not only customary to offer the software to those who wish to replicate your results; for some journals it is mandatory (as is making the data available). Nowhere does Rimm indicate that the data or the software that categorized the listings are "available from the author."

Validity and reliability are not established.
This is despite the fact that standard statistical procedures are available for determining reliability and validity. The few numbers that are presented in this section are either poorly defined or not defined at all. Other quantities are mentioned as being "high," but not reported (e.g. see footnote 70, in which Rimm asserts that "[t]he presentation of kappa values...was considered unnecessary because of the high level of reliability." Yet this "high level" is never reported). Elsewhere, Rimm suggests that "validity was high" (p. 1888), but it too is not reported statistically; or, Rimm states that he performed "a statistical analysis" on the data (p. 1894), but neither the type of analysis nor its results are reported. Such examples, which render meaningless the statements they are intended to support, are too numerous to catalog here. Relatedly, numbers or data that would help the reader understand the analysis are not reported, and numbers that are reported are not pursued for additional insights. This section is ad hoc and weak; no reliable and valid conclusions can be drawn from the analysis as presented.

Moreover, this is a standard content analysis problem. Content analysis has a large and rich literature, yet there is not a single citation to either that vast literature or the related areas of AI software, and classification and categorization. As Rimm presents it, it is not possible to replicate the categorization he performed, let alone determine how he performed it. Thus, the methodology, and this entire section, are impossible to evaluate.

Numerous questions must be raised: What time period or periods are represented in the listings? Did Rimm control for time in his analyses? Were the data adjusted to account for differing lengths of time of each listing? For example, adjusting for date first posted on the BBS? Rimm's procedure implicitly assumes that all downloads are a function of consumer demand and no other variables.
What about availability of certain kinds of images? The cost of the images? Their size? Consumer demographics?

Rimm states that "[m]any BBS either hide [the listing] information from their customers or do not provide it." (pp. 1879-1880). But on page 1878, Rimm states that listings have a typical record structure which he diagrams in Figure 3. If the information is hidden by "many" operators (how many hid it in his study?), or many do not provide it (how many did not provide it in his study?), how did Rimm get it? Rimm suggests that operators were "persuaded" to provide the information "privately." (p. 1881) What does this mean? How were they persuaded? What is meant by "privately?"

How valid are the sixty-three basic categories? Were the categories validated by judges? Did human beings ever look at any descriptions to validate the classification scheme? If so, how many? What exactly was the procedure the judges went through as part of the classification process? Rimm notes that "judges...were not avid consumers of pornography and thus did not recognize names of particular pornographic 'stars.'" (p. 1886) Did this lack of experience on the part of the judges affect or bias the classification procedure? Typically, judges are chosen for their expertise.

How was the "final precedence scheme" in Figure 6 arrived at? Given that Appendix A describes dozens of categories, why are percentages not reported for individual categories within the major groupings? These percentages are important to know because some of the individual categories may be considered less extreme than others. Without knowing the distribution of categories of images within each broad group, it is difficult to know what the group actually represents.

Rimm: "Thus, analyzing the data in these four classes presents a highly reliable means of exploring the explosive growth of pornography on the Information Superhighway." (p.
1886)

Rimm never shows that his method of analysis is "highly reliable," and Rimm never shows that the growth in pornography is "explosive," on the "information superhighway" or anywhere else.

Page 1889 adds nothing to the reader's understanding of the methodology. What is the point of including this discussion? On page 1890, Rimm notes that there were 35 adult BBSs. How is this figure reconciled with the 68, 6 and 27 adult BBSs discussed on page 1853? What is the point of including the Amateur Action BBS Case Study (pp. 1896-1905)?

In general, the conclusions Rimm makes are not supported by his analysis. Because the content analysis and classification scheme are "black boxes," because no reliability and validity results are presented, because no statistical testing of the differences both within and among categories for different types of listings has been performed, and because not a single hypothesis has been tested, formally or otherwise, no conclusions should be drawn until the issues raised in this critique are resolved.

IV. Conclusions

Rimm: "[A]ttempts at regulation may be ineffective; 'the net interprets censorship as damage and routes around it' is a well known expression among Usenet enthusiasts. The current structure of the Usenet requires that individual sites choose between an 'All things not expressly permitted are prohibited' policy, or conversely, an 'All things not expressly prohibited are permitted.' A middle ground does not appear viable."

Curiously, Rimm does not consider the alternative of user-imposed, rather than state-imposed, controls.