Data bunkers

Personal data are collected, appropriated, processed and used for commercial purposes on a global scale. In order for such a global system to operate smoothly, there are server nodes at which the data streams converge. Among the foremost of these are the databases of credit card companies, whose operation has long depended on global networking.

At the top stand credit card companies such as Visa, American Express, MasterCard, and others. It would be erroneous to believe that the primary business of these companies is the provision of credit and the facilitation of credit information for sales transactions. In fact, information means much more than just credit information. In an advertisement of 1982, American Express described itself in these terms: "Our product is information ...Information that charges airline tickets, hotel rooms, dining out, the newest fashions ...information that grows money funds buys and sells equities ...information that pays life insurance annuities ...information that schedules entertainment on cable television and electronically guards houses ...information that changes kroners into guilders and figures tax rates in Bermuda ..."

Information has become something like the gospel of the New Economy, a doctrine of salvation - the lifeblood of society, as Bill Gates expresses it. But behind information there are always data that need to be generated and collected. Because of the critical importance of data to the economy, their possession amounts to power and their loss can cause tremendous damage. The data industry therefore locates its data warehouses behind fortifications that bar physical or electronic access. Such structures are somewhat like a digital reconstruction of the medieval fortress.

Large amounts of data are concentrated in fortress-like structures, in data bunkers. As the Critical Art Ensemble argue in Electronic Civil Disobedience: "The bunker is the foundation of homogeneity, and allows only a singular action within a given situation." All activities within a data bunker revolve around the same principle of calculation. Calculation is the predominant mode of thinking in data-driven societies, and it reaches its greatest density inside data bunkers. However, calculation is not a politically neutral activity, as it provides the rational basis - and therefore the formal legitimisation - for almost every decision taken. Data bunkers therefore have an essentially conservative political function, and serve to maintain and strengthen the given social structures.

TEXTBLOCK 1/8 // URL: http://world-information.org/wio/infostructure/100437611761/100438659754
 
Challenges for Copyright by ICT: Internet Service Providers

ISPs (Internet Service Providers) (and to a certain extent also telecom operators) are involved in the copyright debate primarily because of their role in the transmission and storage of digital information. Problems arise particularly concerning caching, information residing on systems or networks of ISPs at the direction of users, and transitory communication.

Caching

Caching, it is argued, could cause damage because the copies in the cache are not necessarily the most current ones, and the delivery of outdated information to users could deprive website operators of accurate "hit" information (information about the number of requests for a particular piece of material on a website), from which advertising revenue is frequently calculated. Similarly, harms such as defamation or infringement that existed on the original page may propagate for years until flushed from each cache where they have been replicated.

Although they are different concepts, issues similar to those raised by caching arise with mirroring (establishing an identical copy of a website on a different server), archiving (providing a historical repository for information, as with newsgroups and mailing lists), and full-text indexing (the copying of a document for loading into a full-text or nearly full-text database which is searchable for keywords or concepts).

Under a literal reading of some copyright laws, caching constitutes an infringement of copyright. Yet recent legislation like the DMCA or the proposed EU Directive on copyright and related rights in the information society (amended version) has provided exceptions for ISPs concerning particular acts of reproduction that are considered technical copies (caching). Nevertheless, the exemption from liability only applies if ISPs meet a variety of specific conditions. In the course of the debate about caching, suggestions have also been made to subject it to an implied license or fair use defense, or to make it (at least theoretically) actionable.

Information Residing on Systems or Networks at the Direction of Users

ISPs may be confronted with problems if infringing material on users' websites is hosted on their systems. Although some copyright laws like the DMCA provide for limitations on the liability of ISPs if certain conditions are met, it is as yet unclear whether ISPs should generally be accountable for the storage of infringing material (even if they do not have actual knowledge of it) or whether exceptions should only be established under specific circumstances.

Transitory Communication

In the course of transmitting digital information from one point on a network to another, ISPs act as a data conduit. If a user requests information, ISPs engage in its transmission, in providing a connection, or in routing it. In the case of a person sending infringing material over a network, with the ISP merely providing the facilities for the transmission, it is widely held that the ISP should not be liable for infringement. Yet some copyright laws like the DMCA provide for a limitation of liability (which also covers the intermediate and transient copies that are made automatically in the operation of a network) only if the ISP's activities meet certain conditions.

For more information on copyright (intellectual property) related problems of ISPs (BBS (bulletin board system) operators, systems operators and other service providers) see:

Harrington, Mark E.: On-line Copyright Infringement Liability for Internet Service Providers: Context, Cases & Recently Enacted Legislation. In: Intellectual Property and Technology Forum. June 4, 1999.

Teran, G.: Who is Vulnerable to Suit? ISP Liability for Copyright Infringement. November 2, 1999.

TEXTBLOCK 2/8 // URL: http://world-information.org/wio/infostructure/100437611725/100438659550
 
Feeding the data body

TEXTBLOCK 3/8 // URL: http://world-information.org/wio/infostructure/100437611761/100438659644
 
Biometric applications: surveillance

Biometric technologies are not surveillance technologies in themselves, but as identification technologies they provide an input into surveillance. Biometric technologies such as face recognition, for example, are combined with camera systems and criminal data banks in order to supervise public places and single out individuals.

Another example is the use of biometric technologies in the supervision of probationers, who in this way carry their special hybrid status between imprisonment and freedom with them, so that they can be tracked down easily.

Unlike biometric applications in access control, where one is aware of the process of biometric data extraction, what makes the use of biometrics in surveillance particularly critical is the fact that biometric samples are extracted routinely, unnoticed by the individuals concerned.

TEXTBLOCK 4/8 // URL: http://world-information.org/wio/infostructure/100437611729/100438658740
 
In Search of Reliable Internet Measurement Data

Newspapers and magazines frequently report growth rates of Internet usage, numbers of users, hosts, and domains that seem to be beyond all expectations. Growth rates are expected to accelerate exponentially. However, Internet measurement data are anything but reliable and often quite fantastic constructs, which are nevertheless jumped upon by many media and decision makers because the technical difficulties of measuring Internet growth or usage make reliable measurement close to impossible.

Equally, predictions that the Internet is about to collapse lack any foundation whatsoever. The researchers at the Internet Performance Measurement and Analysis Project (IPMA) compiled a list of news items about Internet performance and statistics and a few responses to them by engineers.

Size and Growth

In fact, "today's Internet industry lacks any ability to evaluate trends, identity performance problems beyond the boundary of a single ISP (Internet service provider, M. S.), or prepare systematically for the growing expectations of its users. Historic or current data about traffic on the Internet infrastructure, maps depicting ... there is plenty of measurement occurring, albeit of questionable quality", says K. C. Claffy in his paper Internet measurement and data analysis: topology, workload, performance and routing statistics (http://www.caida.org/Papers/Nae/, Dec 6, 1999). Claffy is not an average researcher; he founded the well-known Cooperative Association for Internet Data Analysis (CAIDA).

So her statement is a slap in the face of all market researchers claiming otherwise.
In a certain sense this is ridiculous, because network measurement has been an important task ever since the inception of the ARPANet, the precursor of the Internet. The very first ARPANet site was established at the University of California, Los Angeles, and was intended to be the measurement site. There, Leonard Kleinrock went on to develop measurement techniques used to monitor the performance of the ARPANet (cf. Michael and Ronda Hauben, Netizens: On the History and Impact of the Net). And in October 1991, on behalf of the Internet Activities Board, Vinton Cerf proposed guidelines for researchers considering measurement experiments on the Internet and stated that measuring the Internet matters for two reasons. First, measurement would be critical for future development, evolution and deployment planning. Second, Internet-wide activities have the potential to interfere with normal operation and must be planned with care and made widely known beforehand.
So what are the reasons for this inability to evaluate trends or to identify performance problems beyond the boundary of a single ISP? First, in early 1995, almost simultaneously with the worldwide introduction of the World Wide Web, the transfer of the National Science Foundation's stewardship role over the Internet to a competitive industry (bluntly put: its privatization) left no framework for adequate tracking and monitoring of the Internet. The early ISPs were not very interested in gathering and analyzing network performance data; they were struggling to meet the demands of their rapidly increasing numbers of customers. Secondly, we are just beginning to develop reliable tools for quality measurement and analysis of bandwidth or performance. CAIDA aims at developing such tools.
"There are many estimates of the size and growth rate of the Internet that are either implausible, or inconsistent, or even clearly wrong", K. G. Coffman and Andrew, both members of different departments of AT & T Labs-Research, state something similar in their paper The Size and Growth Rate of the Internet, published in First Monday. There are some sources containing seemingly contradictory information on the size and growth rate of the Internet, but "there is no comprehensive source for information". They take a well-informed and refreshing look at efforts undertaken for measuring the Internet and dismantle several misunderstandings leading to incorrect measurements and estimations. Some measurements have such large error margins that you might better call them estimations, to say the least. This is partly due to the fact that data are not disclosed by every carrier and only fragmentarily available.
What is measured and what methods are used? Many studies are devoted to the number of users; others look at the number of computers connected to the Internet or count IP addresses. Coffman and Odlyzko focus on the sizes of networks and the traffic they carry to answer questions about the size and the growth of the Internet.
You get a clue to their focus when you bear in mind that the Internet is just one of many networks of networks; it is only a part of the universe of computer networks. Additionally, the Internet has public (unrestricted) and private (restricted) areas. Most studies consider only the public Internet; Coffman and Odlyzko consider the long-distance private line networks too - the corporate networks, the intranets - because they are convinced (that means their assertion is put forward, but not accompanied by empirical data) that "the evolution of the Internet in the next few years is likely to be determined by those private networks, especially by the rate at which they are replaced by VPNs (Virtual Private Networks) running over the public Internet. Thus it is important to understand how large they are and how they behave." Coffman and Odlyzko check other estimates by considering the traffic generated by residential users accessing the Internet with a modem, the traffic through public peering points (statistics for them are available through CAIDA and the National Laboratory for Applied Network Research), and by calculating the bandwidth capacity for each of the major US providers of backbone services. They compare the public Internet to private line networks and offer interesting findings.
The public Internet (with an effective bandwidth of 75 Gbps in December 1997) is currently far smaller, in both capacity and traffic, than the switched voice network, and the private line networks are considerably larger in aggregate capacity than the Internet - about as large as the voice network in the U. S. (with an effective bandwidth of about 330 Gbps in December 1997) - although they carry less traffic. On the other hand, the growth rate of traffic on the public Internet, while lower than is often cited, is still about 100% per year, much higher than for traffic on other networks. Hence, if present growth trends continue, data traffic in the U. S. will overtake voice traffic around the year 2002 and will be dominated by the Internet. In the future, growth in Internet traffic will predominantly derive from people staying online longer and from multimedia applications, which consume more bandwidth; both may lead to unanticipated amounts of data traffic.
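The crossover claim above is simple compound-growth arithmetic. The sketch below projects two traffic volumes forward year by year until the faster-growing one overtakes the other; only the 100% annual growth rate for Internet traffic is taken from the figures above, while the starting volumes and the voice growth rate are hypothetical placeholders, not values from the Coffman and Odlyzko paper.

    def crossover_year(start_year, data_traffic, voice_traffic,
                       data_growth=1.00, voice_growth=0.10):
        """Return the first year in which data traffic exceeds voice traffic,
        assuming constant annual growth rates (1.00 means doubling per year)."""
        year = start_year
        while data_traffic <= voice_traffic:
            year += 1
            data_traffic *= 1 + data_growth
            voice_traffic *= 1 + voice_growth
        return year

    # Made-up relative volumes: voice assumed to be eight times data traffic in 1997.
    print(crossover_year(1997, data_traffic=1.0, voice_traffic=8.0))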

Hosts

The Internet Software Consortium's Internet Domain Survey is one of the best-known efforts to count the number of hosts on the Internet. Happily, the ISC informs us extensively about the methods used for its measurements, a policy quite rare on the Web. For the most recent survey the number of IP addresses that have been assigned a name was counted. At first sight it looks simple to get an accurate number of hosts, but in practice an assigned IP address does not automatically correspond to an existing host. In order to find out, you have to send a kind of message to the host in question and wait for a reply. You do this with the PING utility. (For further explanations look here: Art. PING, in: Connected: An Internet Encyclopaedia) But to do this for every registered IP address is an arduous task, so the ISC just pings a 1% sample of all hosts found and makes a projection onto all pingable hosts. That is ISC's new method; its old method, still used by RIPE, was to count the number of domain names that had IP addresses assigned to them, a method that proved to be not very useful because a significant number of hosts restrict download access to their domain data.
Apart from the small sample, this method has at least one flaw: ISC's researchers only take network numbers into account that have been entered into the tables of the IN-ADDR.ARPA domain, and it is possible that not all providers know of these tables. A similar method is used for Telcordia's Netsizer.
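The sampling and projection step described above can be illustrated with a minimal sketch: ping a 1% random sample of the addresses that have been assigned a name and scale the share of responding hosts up to the whole list. The 1% sample follows ISC's description; the function name, the input list and the Linux ping invocation are assumptions made purely for illustration.

    import random
    import subprocess

    def estimate_pingable_hosts(named_addresses, sample_share=0.01):
        """Ping a random sample of addresses and project the response rate
        onto all addresses that have been assigned a name."""
        sample_size = max(1, int(len(named_addresses) * sample_share))
        sample = random.sample(named_addresses, sample_size)
        responding = 0
        for address in sample:
            # One echo request with a one-second timeout (Linux ping syntax).
            result = subprocess.run(["ping", "-c", "1", "-W", "1", address],
                                    stdout=subprocess.DEVNULL,
                                    stderr=subprocess.DEVNULL)
            if result.returncode == 0:
                responding += 1
        return int(len(named_addresses) * responding / sample_size)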

Internet Weather

Like the daily weather, traffic on the Internet - the conditions for data flows - is monitored too, hence the term Internet weather. One of the most famous Internet weather reports comes from The Matrix, Inc. Another one is the Internet Traffic Report, which displays traffic as values between 0 and 100 (high values indicate fast and reliable connections). For weather monitoring, response ratings from servers all over the world are used. The method is to "ping" servers (as for host counts, for example) and to compare the response times to past ones and to the response times of servers in the same reach.
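A crude version of such a weather index might look like the sketch below: probe a set of servers, compare the current response times to a stored baseline, and average the result on a scale from 0 to 100. The scoring formula and the use of a TCP connect as the probe are assumptions for the sake of illustration; the services mentioned above do not disclose their exact methods.

    import socket
    import time

    def probe(host, port=80, timeout=2.0):
        """Return the TCP connect time to a host in seconds, or None on failure."""
        start = time.monotonic()
        try:
            with socket.create_connection((host, port), timeout=timeout):
                return time.monotonic() - start
        except OSError:
            return None

    def weather_index(hosts, baseline):
        """Score current conditions from 0 (bad) to 100 (fast and reliable)
        by comparing response times to per-host baselines in seconds."""
        scores = []
        for host in hosts:
            rtt = probe(host)
            if rtt is None:
                scores.append(0)  # unreachable counts as the worst case
            else:
                ratio = baseline.get(host, rtt) / rtt  # > 1 means faster than usual
                scores.append(int(100 * min(ratio, 1.0)))
        return sum(scores) // len(scores)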

Hits, Page Views, Visits, and Users

Let us take a look at how these hot lists of the most visited Web sites may be compiled. I say "may be", because the methods used for data retrieval are mostly not fully disclosed.
For some years it was seemingly common sense to report the number of requested files from a Web site, so-called "hits" - a method not very useful, because a document can consist of several files: graphics, text, etc. Just compile a document from some text and some twenty flashy graphics files, put it on the Web, and you get twenty-one hits per visit; the more graphics you add, the more hits and traffic (not automatically to your Web site) you generate.
In the meantime page views, also called page impressions, are preferred, as they are said to avoid these flaws. But even page views are not reliable. Users might share computers and the corresponding IP addresses and host names with others, or they might access not the site itself but a cached copy from the Web browser or from the ISP's proxy server. So the server might receive just one page request although several users viewed a document.
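The difference between hits and page views can be made concrete with a toy log counter: every requested file is a hit, while only requests for HTML documents count as page views. The log entries and the file-extension rule below are simplifying assumptions, not the method of any particular measurement service.

    # Toy access log: one requested file per entry (client address, requested path).
    access_log = [
        ("10.0.0.1", "/index.html"), ("10.0.0.1", "/logo.gif"),
        ("10.0.0.1", "/photo.jpg"), ("10.0.0.2", "/index.html"),
    ]

    hits = len(access_log)                                   # every file request counts
    page_views = sum(1 for _, path in access_log
                     if path.endswith((".html", ".htm")))    # only documents count

    print(hits, page_views)  # four hits, but only two page views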

The editors of some electronic journals (e-journals) in particular rely on page views as a kind of ratings or circulation measure, Rick Marin reports in the New York Times. Click-through rates - a quantitative measure - are used as a substitute for something of an intrinsically qualitative nature, for example the importance of a column to its readers. Readers may follow a journal just for a special column and not care about the journal's other contents. Deleting this column because it does not receive enough visits may cause these readers to turn their backs on the journal.
More advanced, but at best only slightly better, is counting visits: the access of several pages of a Web site during one session. The problems already mentioned apply here too. To avoid them, newspapers, for example, establish registration services, which require password authentication and therefore prove to be a kind of access obstacle.
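Counting visits additionally requires grouping page views into sessions. A common heuristic - assumed here, since no particular method is prescribed above - is to start a new visit whenever the same client has been idle for more than 30 minutes; the sketch below applies it to (client, timestamp) pairs.

    def count_visits(requests, idle_limit=30 * 60):
        """Count visits from (client, timestamp in seconds) pairs; a gap longer
        than idle_limit for the same client starts a new visit."""
        last_seen = {}
        visits = 0
        for client, timestamp in sorted(requests, key=lambda r: r[1]):
            previous = last_seen.get(client)
            if previous is None or timestamp - previous > idle_limit:
                visits += 1
            last_seen[client] = timestamp
        return visits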
But there is a different reason for these registration services. For content providers, users are virtual users, not unique persons, because, as already mentioned, computers and IP addresses can be shared, and the Internet is a client-server system; in a certain sense it is in fact computers that communicate with each other. Therefore many content providers are eager to learn more about the users accessing their sites. On-line registration forms or WWW user surveys are obvious methods of collecting additional data. But you cannot be sure that the information given by users is reliable; you can only rely on the fact that somebody visited your Web site. Despite these obstacles, companies increasingly use data capturing. As with registration services, cookies come into play here.

If you like to play around with Internet statistics instead, you can use Robert Orenstein's Web Statistics Generator to make irresponsible predictions or visit the Internet Index, an occasional collection of seemingly statistical facts about the Internet.

Measuring the Density of IP Addresses

Measuring the density of IP addresses or domain names makes the geography of the Internet visible. So where on earth is the density of IP addresses or domain names highest? There is no global study of the Internet's geographical patterns available yet, but some regional studies can be found. The Urban Research Initiative and Martin Dodge and Narushige Shiode from the Centre for Advanced Spatial Analysis at University College London have mapped the Internet address space of New York, Los Angeles and the United Kingdom (http://www.geog.ucl.ac.uk/casa/martin/internetspace/paper/telecom.html and http://www.geog.ucl.ac.uk/casa/martin/internetspace/paper/gisruk98.html).
Dodge and Shiode used data on the ownership of IP addresses from RIPE, Europe's most important registry for Internet numbers.
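A minimal version of such a density measurement could simply count how many registered address blocks fall on each location in registry data of the kind Dodge and Shiode obtained from RIPE. The record format below is a made-up simplification, not RIPE's actual database schema.

    from collections import Counter

    # Hypothetical, simplified registry records: (network prefix, registered location).
    allocations = [
        ("193.170.0.0/15", "Vienna"),
        ("194.112.0.0/16", "London"),
        ("195.34.128.0/17", "London"),
    ]

    density = Counter(location for _, location in allocations)
    for location, blocks in density.most_common():
        print(location, blocks)  # locations ranked by number of registered blocks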





TEXTBLOCK 5/8 // URL: http://world-information.org/wio/infostructure/100437611791/100438658352
 
Private data bunkers

On the other hand there are the data bunkers of the private sector, whose position is different. Although these are fast-growing engines of data collection with a much greater degree of dynamism, they may not enjoy the same privileged position - although one has to differentiate according to the general historical and social conditions into which a data bunker is embedded. For example, it can safely be assumed that the databases of a large credit card company or bank are better protected than those of the bureaucracies of small developing countries.

Private data bunkers include

    Banks

    Building societies

    Credit bureaus

    Credit card companies

    Direct marketing companies

    Insurance companies

    Telecom service providers

    Mail order stores

    Online stores


TEXTBLOCK 6/8 // URL: http://world-information.org/wio/infostructure/100437611761/100438659735
 
Late 1950s - Early 1960s: Second Generation Computers

An important change in the development of computers occurred in 1948 with the invention of the transistor. It replaced the large, unwieldy vacuum tube and as a result led to a shrinking in size of electronic machinery. The transistor was first applied to a computer in 1956. Combined with the advances in magnetic-core memory, the use of transistors resulted in computers that were smaller, faster, more reliable and more energy-efficient than their predecessors.

Stretch by IBM and LARC by Sperry-Rand (1959) were the first large-scale machines to take advantage of transistor technology (and also used assembly language instead of the difficult machine language). Both, developed for atomic energy laboratories, could handle enormous amounts of data, but were still costly and too powerful for the business sector's needs. Therefore only two LARCs were ever installed.

Throughout the early 1960s there were a number of commercially successful computers (for example the IBM 1401) used in business, universities, and government, and by 1965 most large firms routinely processed financial information using computers. Decisive for the success of computers in business were the stored-program concept and the development of sophisticated high-level programming languages like FORTRAN (Formula Translator), 1956, and COBOL (Common Business-Oriented Language), 1960, which gave them the flexibility to be cost-effective and productive. The invention of second generation computers also marked the beginning of an entire branch of industry, the software industry, and the birth of a wide range of new types of careers.

TEXTBLOCK 7/8 // URL: http://world-information.org/wio/infostructure/100437611663/100438659439
 
Product Placement

With television still being very popular, commercial entertainment has transferred the concept of soap operas onto the Web. The first of this new species of "cybersoaps" was "The Spot", a story about the ups and downs of an American commune. The Spot not only attracted a large audience within a short time, but also pioneered in the field of online product placement. Besides Sony banners, the company's logo is also placed on nearly every electronic product appearing in the story. Appearing as a site for light entertainment, The Spot's main goal is to make the name Sony and its product range well known within the target audience.

TEXTBLOCK 8/8 // URL: http://world-information.org/wio/infostructure/100437611652/100438658026
 
Internet Software Consortium

The Internet Software Consortium (ISC) is a nonprofit corporation dedicated to the production of high-quality reference implementations of Internet standards that meet production standards. Its goal is to ensure that those reference implementations are properly supported and made freely available to the Internet community.

http://www.isc.org

INDEXCARD, 1/8
 
Neighboring rights

Copyright laws generally provide for three kinds of neighboring rights: 1) the rights of performing artists in their performances, 2) the rights of producers of phonograms in their phonograms, and 3) the rights of broadcasting organizations in their radio and television programs. Neighboring rights attempt to protect those who assist intellectual creators to communicate their message and to disseminate their works to the public at large.

INDEXCARD, 2/8
 
RSA

The best known of the two-key cryptosystems developed since the mid-1970s is the Rivest-Shamir-Adleman (RSA) cryptoalgorithm, which was first published in April 1977. Since that time, the algorithm has been employed in the most widely used Internet electronic communications encryption program, Pretty Good Privacy (PGP). It is also employed in both the Netscape Navigator and Microsoft Explorer web browsing programs in their implementations of the Secure Sockets Layer (SSL), and by Mastercard and VISA in the Secure Electronic Transactions (SET) protocol for credit card transactions.

INDEXCARD, 3/8
 
Calculator

Calculators are machines for automatically performing arithmetical operations and certain mathematical functions. Modern calculators are descendants of a digital arithmetic machine devised by Blaise Pascal in 1642. Later in the 17th century, Gottfried Wilhelm von Leibniz created a more advanced machine, and, especially in the late 19th century, inventors produced calculating machines that were smaller and smaller and less and less laborious to use.

INDEXCARD, 4/8
 
Royalties

Royalties refer to the payment made to the owners of certain types of rights by those who are permitted by the owners to exercise the rights. The rights concerned are literary, musical, and artistic copyright and patent rights in inventions and designs (as well as rights in mineral deposits, including oil and natural gas). The term originated from the fact that in Great Britain for centuries gold and silver mines were the property of the crown and such "royal" metals could be mined only if a payment ("royalty") were made to the crown.

INDEXCARD, 5/8
 
Alexander Graham Bell

b., March 3, 1847, Edinburgh

d. Aug. 2, 1922, Beinn Bhreagh, Cape Breton Island, Nova Scotia, Canada

American audiologist and inventor wrongly remembered for having invented the telephone in 1876. Although Bell introduced the first commercial application of the telephone, in fact a German teacher called Philipp Reis invented it.

For more detailed information see the Encyclopaedia Britannica: http://www.britannica.com/bcom/eb/article/1/0,5716,15411+1+15220,00.html

INDEXCARD, 6/8
 
Robot

Robot relates to any automatically operated machine that replaces human effort, though it may not resemble human beings in appearance or perform functions in a humanlike manner. The term is derived from the Czech word robota, meaning "forced labor." Modern use of the term stems from the play R.U.R., written in 1920 by the Czech author Karel Capek, which depicts society as having become dependent on mechanical workers called robots that are capable of doing any kind of mental or physical work. Modern robot devices descend through two distinct lines of development - the early automata, essentially mechanical toys, and the successive innovations and refinements introduced in the development of industrial machinery.

INDEXCARD, 7/8
 
Machine language

Initially computer programmers had to write instructions in machine language. This coded language, which can be understood and executed directly by the computer without conversion or translation, consists of binary digits representing operation codes and memory addresses. Because it is made up of strings of 1s and 0s, machine language is difficult for humans to use.

INDEXCARD, 8/8