In Search of Reliable Internet Measurement Data
Newspapers and magazines frequently report growth rates of Internet usage, number of users, hosts, and domains that seem to be beyond all expectations. Growth rates are expected to accelerate exponentially. However, Internet measurement data are anything but reliable. Because the technical difficulties in measuring Internet growth or usage make reliable measurement impossible, the figures are often quite fantastic constructs, which many media and decision makers nevertheless jump upon.
Equally, predictions that the Internet is about to collapse lack any foundation whatsoever. The researchers at the Internet Performance Measurement and Analysis Project (IPMA) compiled a list of news items about Internet performance and statistics and a few responses to them by engineers.
Size and Growth
In fact, "today's Internet industry lacks any ability to evaluate trends, identify performance problems beyond the boundary of a single ISP (Internet service provider, M. S.), or prepare systematically for the growing expectations of its users. Historic or current data about traffic on the Internet infrastructure, maps depicting ... there is plenty of measurement occurring, albeit of questionable quality", says K. C. Claffy in the paper Internet measurement and data analysis: topology, workload, performance and routing statistics (http://www.caida.org/Papers/Nae/, Dec 6, 1999). Claffy is not an average researcher, but the founder of the well-known Cooperative Association for Internet Data Analysis (CAIDA).
This statement is a slap in the face of all market researchers who claim otherwise. In a certain sense the situation is absurd, because network measurement has been an important task ever since the inception of the ARPANet, the forerunner of the Internet. The very first ARPANet site was established at the University of California, Los Angeles, and was intended to be the measurement site. There, Leonard Kleinrock went on to develop the measurement techniques used to monitor the performance of the ARPANet (cf. Michael and Ronda Hauben, Netizens: On the History and Impact of the Net). And in October 1991, Vinton Cerf, on behalf of the Internet Activities Board, proposed guidelines for researchers considering measurement experiments on the Internet. There were two reasons for this. First, measurement would be critical for future development, evolution and deployment planning. Second, Internet-wide activities have the potential to interfere with normal operation and must be planned with care and made widely known beforehand.
So what are the reasons for this inability to evaluate trends and identify performance problems beyond the boundary of a single ISP? First, in early 1995, almost simultaneously with the worldwide introduction of the World Wide Web, the transfer of the National Science Foundation's stewardship of the Internet to a competitive industry (put bluntly: its privatization) left no framework for adequate tracking and monitoring of the Internet. The early ISPs were not very interested in gathering and analyzing network performance data; they were struggling to meet the demands of a rapidly growing number of customers. Second, we are only beginning to develop reliable tools for quality measurement and analysis of bandwidth or performance. CAIDA aims at developing such tools.
K. G. Coffman and Andrew Odlyzko, members of different departments of AT&T Labs-Research, say something similar in their paper The Size and Growth Rate of the Internet, published in First Monday: "There are many estimates of the size and growth rate of the Internet that are either implausible, or inconsistent, or even clearly wrong." There are some sources containing seemingly contradictory information on the size and growth rate of the Internet, but "there is no comprehensive source for information". They take a well-informed and refreshing look at the efforts undertaken to measure the Internet and dispel several misunderstandings that lead to incorrect measurements and estimates. Some measurements have such large error margins that they are better called estimates, to say the least. This is partly because not every carrier discloses its data, so the data are only fragmentarily available.
What is measured and what methods are used? Many studies are devoted to the number of users; others look at the number of computers connected to the Internet or count IP addresses. Coffman and Odlyzko focus on the sizes of networks and the traffic they carry to answer questions about the size and the growth of the Internet. Their focus becomes clear when you bear in mind that the Internet is just one of many networks of networks; it is only a part of the universe of computer networks. Additionally, the Internet has public (unrestricted) and private (restricted) areas. Most studies consider only the public Internet; Coffman and Odlyzko also consider the long-distance private line networks, that is, the corporate networks or intranets, because they are convinced (meaning that the assertion is put forward but not backed by empirical data) that "the evolution of the Internet in the next few years is likely to be determined by those private networks, especially by the rate at which they are replaced by VPNs (Virtual Private Networks) running over the public Internet. Thus it is important to understand how large they are and how they behave."
Coffman and Odlyzko check other estimates by considering the traffic generated by residential users accessing the Internet with a modem and the traffic through public peering points (statistics for them are available through CAIDA and the National Laboratory for Applied Network Research), and by calculating the bandwidth capacity of each of the major US providers of backbone services. They compare the public Internet to private line networks and offer interesting findings. The public Internet, with an effective bandwidth of 75 Gbps in December 1997, is currently far smaller, in both capacity and traffic, than the switched voice network. The private line networks, however, are considerably larger in aggregate capacity than the Internet: at about 330 Gbps in December 1997 they are about as large as the voice network in the U. S., although they carry less traffic. On the other hand, the growth rate of traffic on the public Internet, while lower than is often cited, is still about 100% per year, much higher than for traffic on other networks. Hence, if present growth trends continue, data traffic in the U. S. will overtake voice traffic around the year 2002 and will be dominated by the Internet. In the future, growth in Internet traffic will derive predominantly from people staying online longer and from multimedia applications; because both consume more bandwidth, they are the reason for unanticipated amounts of data traffic.
Hosts
The Internet Software Consortium's Internet Domain Survey is one of the best-known efforts to count the number of hosts on the Internet. Happily, the ISC informs us extensively about the methods used for its measurements, a policy quite rare on the Web. For the most recent survey, the number of IP addresses that have been assigned a name was counted. At first sight it looks simple to get an accurate number of hosts, but in practice an assigned IP address does not automatically correspond to an existing host. To find out, you have to send a kind of message to the host in question and wait for a reply. You do this with the PING utility. (For further explanations see the article PING in Connected: An Internet Encyclopaedia.) But doing this for every registered IP address is an arduous task, so ISC pings just a 1% sample of all hosts found and projects the result onto all pingable hosts. That is ISC's new method; its old method, still used by RIPE, was to count the number of domain names that had IP addresses assigned to them, a method that proved not very useful because a significant number of hosts restrict download access to their domain data. Apart from the small sample, the new method has at least one flaw: ISC's researchers take into account only network numbers that have been entered into the tables of the IN-ADDR.ARPA domain, and it is possible that not all providers know of these tables. A similar method is used for Telcordia's Netsizer.
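The sampling step described above can be sketched roughly as follows. This is not ISC's actual code: the function name, the `is_reachable` callback (standing in for a real ping) and the fake address list are assumptions made for illustration.

```python
import random


def estimate_reachable_hosts(addresses, is_reachable, sample_fraction=0.01, seed=None):
    """Estimate how many addresses answer a probe by testing only a random sample.

    addresses: list of named IP addresses found in the survey.
    is_reachable: hypothetical callback that returns True if a host replies.
    """
    if not addresses:
        return 0
    rng = random.Random(seed)
    sample_size = max(1, round(len(addresses) * sample_fraction))
    sample = rng.sample(addresses, sample_size)
    responders = sum(1 for address in sample if is_reachable(address))
    # Project the response rate observed in the sample onto the full list.
    return round(responders / sample_size * len(addresses))


# Illustration with made-up data: every second address "replies".
fake_addresses = list(range(10_000))
estimate = estimate_reachable_hosts(fake_addresses, lambda a: a % 2 == 0, seed=1)
print(estimate)
```

Because only 100 of the 10,000 addresses are probed here, the estimate fluctuates around the true value of 5,000 from one sample to the next, which is the error margin the small sample brings with it.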
Internet Weather
Like the daily weather, traffic on the Internet, that is, the conditions for data flows, is monitored too; this is called Internet weather. One of the most famous Internet weather reports comes from The Matrix, Inc. Another is the Internet Traffic Report, which displays traffic as values between 0 and 100 (high values indicate fast and reliable connections). Weather monitoring relies on response ratings from servers all over the world. The method is to "ping" servers (as is done for host counts, for example) and to compare the response times with past ones and with the response times of servers in the same region.
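How ping results might be turned into a value between 0 and 100 can be sketched as below. The formula is purely an assumption for illustration; the Internet Traffic Report's real calculation is not given here, and the function and parameter names are hypothetical.

```python
def traffic_index(current_ms, past_ms, replies_received, probes_sent):
    """Return a score from 0 to 100; higher means faster and more reliable.

    current_ms: current average response time of a server in milliseconds.
    past_ms: list of earlier response times used as the baseline.
    The scoring formula is a made-up example, not a published method.
    """
    if probes_sent <= 0 or not past_ms or current_ms <= 0:
        raise ValueError("need at least one probe, a baseline and a positive time")
    if replies_received == 0:
        return 0.0
    baseline = sum(past_ms) / len(past_ms)
    speed = min(1.0, baseline / current_ms)       # 1.0 if as fast as in the past
    reliability = replies_received / probes_sent  # share of pings answered
    return round(100 * speed * reliability, 1)


# A server that now answers twice as slowly as before and drops half the pings.
print(traffic_index(current_ms=200, past_ms=[100, 100], replies_received=5, probes_sent=10))  # 25.0
```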
Hits, Page Views, Visits, and Users
Let us take a look at how these hot lists of most visited Web sites may be compiled. I say "may be" because the methods used for data retrieval are mostly not fully disclosed. For some years it seemed to be common sense to report the files requested from a Web site, the so-called "hits". This method is not very useful, because a document can consist of several files: graphics, text, etc. Just compile a document from some text and some twenty flashy graphical files, put it on the Web, and you get twenty-one hits per visit; the more graphics you add, the more hits and traffic (not automatically to your Web site) you generate. In the meantime page views, also called page impressions, are preferred, which are said to avoid these flaws. But even page views are not reliable. Users might share computers, and the corresponding IP addresses and host names, with others, and a user might access not the site itself but a cached copy from the Web browser or from the ISP's proxy server. So the server might receive just one page request although several users viewed a document.
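The difference between hits and page views can be made concrete with a small sketch. The file names and the rule that only paths ending in ".html", ".htm" or "/" count as pages are assumptions for illustration, not a standard definition.

```python
PAGE_EXTENSIONS = (".html", ".htm")


def count_hits_and_page_views(requested_paths):
    """Return (hits, page_views) for a list of requested file paths."""
    hits = len(requested_paths)  # every requested file is one hit
    page_views = sum(
        1 for path in requested_paths
        if path.endswith("/") or path.lower().endswith(PAGE_EXTENSIONS)
    )
    return hits, page_views


# One visit to a document made of one text file and twenty graphics.
one_visit = ["/index.html"] + [f"/img/banner{i}.gif" for i in range(20)]
print(count_hits_and_page_views(one_visit))  # (21, 1)
```

Neither number says how many people saw the page: a copy served from a browser cache or a proxy never reaches the server and is counted in neither column.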
The editors of some electronic journals (e-journals) in particular rely on page views as a kind of ratings or circulation measure, Rick Marin reports in the New York Times. Click-through rates, a quantitative measure, are used as a substitute for something of an intrinsically qualitative nature, for example the importance of a column to its readers. They may read a journal just for a special column and not care about the journal's other contents. Deleting this column because it does not receive enough visits may cause these readers to turn their backs on the journal.
More advanced, but only slightly better at best, is counting visits, that is, the access of several pages of a Web site during one session. The problems already mentioned apply here too. To avoid them, newspapers, for example, establish registration services, which require password authentication and therefore prove to be a kind of access obstacle. But there is a different reason for these services. For content providers, users are virtual users, not unique persons, because, as already mentioned, computers and IP addresses can be shared and the Internet is a client-server system; in a certain sense it is in fact computers that communicate with each other. Therefore many content providers are eager to get to know more about the users accessing their sites. On-line registration forms or WWW user surveys are obvious methods of collecting additional data. But you cannot be sure that the information given by users is reliable; you can rely only on the fact that somebody visited your Web site. Despite these obstacles, companies increasingly use data capturing. As with registration services, cookies come into play here.
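Counting visits means grouping page requests into sessions. A minimal sketch follows, assuming that a client is identified by its IP address and that a visit ends after 30 minutes of inactivity; both the timeout and the function name are illustrative choices, not a fixed standard.

```python
def count_visits(requests, timeout_seconds=1800):
    """Count visits in a list of (client_id, unix_timestamp) page requests.

    A new visit starts when a client is seen for the first time or returns
    after more than timeout_seconds of inactivity (assumed: 30 minutes).
    """
    last_seen = {}
    visits = 0
    for client, timestamp in sorted(requests, key=lambda request: request[1]):
        previous = last_seen.get(client)
        if previous is None or timestamp - previous > timeout_seconds:
            visits += 1
        last_seen[client] = timestamp
    return visits


page_requests = [
    ("10.0.0.1", 0), ("10.0.0.1", 60),  # first visit from 10.0.0.1
    ("10.0.0.2", 30),                   # a visit from another address
    ("10.0.0.1", 4000),                 # 10.0.0.1 returns over an hour later
]
print(count_visits(page_requests))  # 3
```

The sketch inherits the weakness named above: two people behind the same address within the timeout are merged into one visit, and one person switching computers is counted twice.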
If you would rather play around with Internet statistics, you can use Robert Orenstein's Web Statistics Generator to make irresponsible predictions or visit the Internet Index, an occasional collection of seemingly statistical facts about the Internet.
Measuring the Density of IP Addresses
Measuring the density of IP addresses or domain names makes the geography of the Internet visible. So where on earth is the highest density of IP addresses or domain names? No global study of the Internet's geographical patterns is available yet, but some regional studies can be found. The Urban Research Initiative and Martin Dodge and Narushige Shiode from the Centre for Advanced Spatial Analysis at University College London have mapped the Internet address space of New York, Los Angeles and the United Kingdom (http://www.geog.ucl.ac.uk/casa/martin/internetspace/paper/telecom.html and http://www.geog.ucl.ac.uk/casa/martin/internetspace/paper/gisruk98.html). Dodge and Shiode used data on the ownership of IP addresses from RIPE, Europe's most important registry for Internet numbers.
TEXTBLOCK 1/7 // URL: http://world-information.org/wio/infostructure/100437611791/100438658352
Intellectual Property and the "Information Society" Metaphor
Today the talk about the so-called "information society" is ubiquitous. Many consider it the successor of the industrial society and say that it represents a new form of social and economic organization. This claim is based on the argument that the information society uses a new kind of resource, which differs fundamentally from that of its industrial counterpart. Whereas industrial societies focus on physical objects, the information society's raw material is said to be knowledge and information. Yet the conception of the capitalist system, which underlies industrial societies, also continues to exist in an information-based environment. Although there have been changes in the forms of manufacture, the relations of production remain organized on the same basis: the principle of property.
In the context of a capitalist system based on industrial production, the term property predominantly relates to material goods. Yet even as the raw materials, resources and products change in an information society, the concept of property persists. It is merely extended: it no longer considers only physical objects as property, but also attempts to put information into a set of property relations. This new kind of knowledge-based property is widely referred to as "intellectual property". Although intellectual property in some ways represents a novel form of property, it has quickly been integrated into the traditional property framework. Whether products are material or immaterial, within the capitalist system they are treated the same: as property.
TEXTBLOCK 2/7 // URL: http://world-information.org/wio/infostructure/100437611725/100438659429
Virtual body and data body
The result of this informatisation is the creation of a virtual body which is the exterior of a man's or woman's social existence. It plays the same role as the physical body, except that it is located in virtual space (it has no real location). The virtual body holds a certain emancipatory potential. It allows us to go to places and to do things which would be impossible in the physical world. It does not have the weight of the physical body, and is less conditioned by physical laws. It therefore allows one to create an identity of one's own, with far fewer restrictions than would apply in the physical world.
But this new freedom has a price. In the shadow of virtualisation, the data body has emerged. The data body is a virtual body which is composed of the files connected to an individual. As the Critical Art Ensemble observe in their book Flesh Machine, the data body is the "fascist sibling" of the virtual body; it is "a much more highly developed virtual form, and one that exists in complete service to the corporate and police state."
The virtual character of the data body means that the social regulation that applies to the real body is absent. While there are limits to the manipulation and exploitation of the real body (even if these limits are not respected everywhere), there is little regulation concerning the manipulation and exploitation of the data body, although the data body is much easier to manipulate than the real body. The seizure of the data body from outside the individual concerned often goes undetected, as it has become part of the basic structure of an informatised society. But data bodies serve as raw material for the "New Economy". Both business and governments claim access to data bodies. By seizing data bodies, power can be exercised and democratic decision-making procedures bypassed. This totalitarian potential makes the data body a deeply problematic phenomenon that calls for an understanding of data as a social construction rather than as something representative of an objective reality. How data bodies are generated, what happens to them and who has control over them is therefore a highly relevant political question.
TEXTBLOCK 3/7 // URL: http://world-information.org/wio/infostructure/100437611761/100438659695
Positions Towards the Future of Copyright in the "Digital Age"
With the development of new transmission, distribution and publishing technologies and the increasing digitalization of information, copyright has become the subject of vigorous debate. Among the variety of attitudes towards the future of traditional copyright protection, two main tendencies can be identified:
Eliminate Copyright
Anti-copyrightists believe that any intellectual property should be in the public domain and available for all to use. "Information wants to be free", and copyright restricts people's possibilities for using digital content. An enforced copyright will widen the digital divide, as copyright creates unjust monopolies in the basic commodity of the "information age". In addition, the increased ease of copying effectively obviates copyright, which is a relic of the past and should be expunged.
Enlarge Copyright
Realizing the growing economic importance of intellectual property, the holders of copyright, who are therefore the recipients of the royalties (in particular the big publishing, distribution and other core copyright industries), adhere to the idea of enlarging copyright. In their view the basic foundation of copyright, namely the need to protect authors so as to give them an incentive to invest the time and effort required to produce creative works, is also relevant in a digital environment.
TEXTBLOCK 4/7 // URL: http://world-information.org/wio/infostructure/100437611725/100438659711
Intellectual Property: A Definition
Intellectual property, very generally, relates to the output that results from intellectual activity in the industrial, scientific, literary and artistic fields. Traditionally intellectual property is divided into two branches:
1) Industrial Property
a) Inventions
b) Marks (trademarks and service marks)
c) Industrial designs
d) Unfair competition (trade secrets)
e) Geographical indications (indications of source and appellations of origin)
2) Copyright
The protection of intellectual property is guaranteed through a variety of laws, which grant the creators of intellectual goods and services certain time-limited rights to control the use made of their products. Those rights apply to the intellectual creation as such, and not to the physical object in which the work may be embodied.
TEXTBLOCK 5/7 // URL: http://world-information.org/wio/infostructure/100437611725/100438659434
Virtual cartels, oligopolistic structures
Global networks require global technical standards ensuring the compatibility of systems. Being able to define such standards makes a corporation extremely powerful. And it requires the suspension of competitive practices. Competition is relegated to the symbolic realm. Diversity and pluralism become the victims of the globalisation of baroque sameness.
The ICT market is dominated by imperfect competition aimed at short-term market domination. In a very short time, new ideas can turn into best-selling technologies. Innovation cycles are extremely short. But today's state-of-the-art products are embryonic trash.
According to the Computer and Communications Industry Association, Microsoft is trying to aggressively take over the network market. This would mean that AT&T would control 70 % of all long distance phone calls and 60 % of cable connections.
AOL and Yahoo are lone leaders in the provider market. AOL has 21 million subscribers in 100 countries. In a single month, AOL registers 94 million visits. Two thirds of all US internet users visited Yahoo in December 1999.
The world's 13 biggest internet providers are all American.
AOL and Microsoft have concluded a strategic cross-promotion deal. In the US, the AOL icon is installed on every Windows desktop. AOL has also concluded a strategic alliance with Coca Cola.
TEXTBLOCK 6/7 // URL: http://world-information.org/wio/infostructure/100437611709/100438658963
Transparent customers. Direct marketing online
This process works even better on the Internet because of the latter's interactive nature. "The Internet is a dream to direct marketers", said Wil Lansing, CEO of the American retailer Fingerhut Companies. Many services require you to register online and to provide as much information about yourself as possible. In addition, the Internet is fast, cheap and used by people who tend to be young and on the search for something interesting.
Many web sites are also equipped with user tracking technology that registers a user's behaviour and preferences during a visit. For example, user tracking technology is capable of identifying the equipment and software employed by a user, as well as movements on the website, links visited, etc. Normally such information is anonymous, but it can be personalised when it is coupled with online registration, or when personal identification has been obtained from other sources. Registration is often a prerequisite not just for obtaining a free web mail account, but also for other services, such as personalised start pages. Based on the information provided by the user, the start page will then include advertisements and commercial offers that correspond to the user's profile, or to the user's activity on the website.
One frequent way of obtaining such personal information is through the free web mail accounts offered by a great many companies, internet providers and web portals (e.g. Microsoft, Yahoo, Netscape and many others). In most cases, users get "free" accounts in return for submitting personal information and agreeing to receive marketing mails. Free web mail accounts are a simple and effective direct marketing and data capturing strategy which is, however, rarely understood as such. Yet the alliances formed between direct advertising and marketing agencies on the one hand, and web mail providers on the other, such as the one between DoubleClick and Yahoo, show the common logic of data capturing and direct marketing. The alliance between DoubleClick and Yahoo eventually attracted the largest US direct marketing agency, Abacus Direct, which was eventually acquired by DoubleClick.
However, the intention of collecting users' personal data and creating consumer profiles based on online behaviour can also take on more creative and playful forms. One such example is sixdegrees.com. This is a networking site based on the assumption that everybody on the planet is connected to everybody else by a chain of six people at most. The site offers users the chance to get to know a lot of new people, the friends of their friends of their friends, for example, and, if they try hard enough, eventually Warren Beatty or Claudia Schiffer. But of course, in order to make the whole game more useful for marketing purposes, users are encouraged to join groups which share common interests, and these interests are identical with marketing categories ranging from arts and entertainment to travel and holiday. Evidently, the game becomes more interesting the more new people a user brings into the network. What seems to be fun for the 18 to 24 year old college student customer segment targeted by sixdegrees is, of course, real business. While users entertain themselves they are being carefully profiled. After all, data of young people who can be expected to be relatively affluent one day are worth more than money.
The particular way in which sites such as sixdegrees.com are structured means not only that users provide initial information about themselves, but also that this information is constantly updated and therefore becomes even more valuable. Consequently, many free online services or web mail providers cancel a user's account if it has not been used for some time.
There are also other online services which offer free services in return for personal information which is then used for marketing purposes, e.g. Yahoo's Geocities, where users may maintain their own free websites, or Bigfoot, where people are offered a free e-mail address for life that acts as a relay whenever a customer's residence or e-mail address changes. In this way, of course, the marketers can identify friendship and other social networks, and turn this knowledge into a marketing advantage. People finders such as WhoWhere? operate along similar lines.
A further way of collecting consumer data that has recently become popular is offering free PCs. Users are provided with a PC for free or for very little money, and in return commit themselves to using certain services rather than others (e.g. a particular internet provider), to providing information about themselves, and to having their online behaviour monitored by the company providing the PC, so that accurate user profiles can be compiled. For example, the Free PC Network offers advertisers user profiles containing "over 60 individual demographics". There are literally thousands of variations of how a user's data are extracted and commercialised when online. Usually this happens quietly in the background.
A good inside view of the world of direct marketing can be gained at the website of the American Direct Marketing Association and the Federation of European Direct Marketing.
TEXTBLOCK 7/7 // URL: http://world-information.org/wio/infostructure/100437611761/100438659667
Cooperative Association for Internet Data Analysis (CAIDA)
Based at the University of California's San Diego Supercomputer Center, CAIDA supports cooperative efforts among the commercial, government and research communities aimed at promoting a scalable, robust Internet infrastructure. It is sponsored by the Defense Advanced Research Projects Agency (DARPA) through its Next Generation Internet program, by the National Science Foundation, Cisco, Inc., and Above.net.
INDEXCARD, 1/7
Intellectual property
Intellectual property, very generally, relates to the output that results from intellectual activity in the industrial, scientific, literary and artistic fields. Traditionally intellectual property is divided into two branches: 1) industrial property (inventions, marks, industrial designs, unfair competition and geographical indications), and 2) copyright. The protection of intellectual property is guaranteed through a variety of laws, which grant the creators of intellectual goods and services certain time-limited rights to control the use made of their products.
INDEXCARD, 2/7
Vinton Cerf
Regarded as one of the fathers of the Internet, Vinton Cerf developed the TCP/IP protocol suite together with Robert Kahn; it remains the de facto communication standard for the Internet. He also contributed to the development of other important communication standards. The early work on the protocols broke new ground with the realization of a multi-network open architecture.
In 1992, he co-founded the Internet Society, where he served as its first President and later as Chairman.
Today, Vinton Cerf is Senior Vice President for Internet Architecture and Technology at WorldCom, one of the world's most important ICT companies.
Vinton Cerf's web site: http://www.wcom.com/about_the_company/cerfs_up/
http://www.isoc.org/
http://www.wcom.com/
INDEXCARD, 3/7
Core copyright industries
These are the industries that create copyrighted works as their primary product. They include the motion picture industry (television, theatrical, and home video), the recording industry (records, tapes and CDs), the music publishing industry, the book, journal and newspaper publishing industry, and the computer software industry (including data processing, business applications and interactive entertainment software on all platforms), legitimate theater, advertising, and the radio, television and cable broadcasting industries.
INDEXCARD, 4/7
Cookie
A cookie is an information package assigned to a client program (mostly a Web browser) by a server. The cookie is saved on your hard disk and is sent back each time this server is accessed. A cookie can hold various kinds of information, for example preferences for site access, the identity of authorized users, or a record of visits.
In online advertising, cookies serve the purpose of changing advertising banners between visits, or identifying a particular direct marketing strategy based on a user's preferences and responses.
Advertising banners can be permanently eliminated from the screen by filtering software such as that offered by Naviscope or Webwash.
Cookies are usually stored in a separate file of the browser, and can be erased or permanently deactivated, although many web sites require cookies to be active.
http://www.naviscope.com/
http://www.webwash.com/
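The exchange described above, in which a server assigns a cookie and the client sends it back, can be sketched with Python's standard `http.cookies` module. The cookie name `visitor_id`, its value and the two helper functions are made up for illustration; a real site would choose its own.

```python
from http.cookies import SimpleCookie


def make_set_cookie(name, value):
    """Server side: build the value of a Set-Cookie response header."""
    cookie = SimpleCookie()
    cookie[name] = value
    cookie[name]["path"] = "/"  # send the cookie back for every page of the site
    return cookie[name].OutputString()


def read_cookie(cookie_header, name):
    """Server side: read one cookie from the Cookie header a browser sent back."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    return cookie[name].value if name in cookie else None


# First visit: the server hands out an identifier.
print(make_set_cookie("visitor_id", "abc123"))
# Later visits: the browser returns it, so requests can be tied together.
print(read_cookie("visitor_id=abc123; theme=dark", "visitor_id"))
```

Because the same identifier comes back with every request, the server can link visits that would otherwise look unrelated, which is what makes cookies useful for tracking.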
INDEXCARD, 5/7
Moral rights
Authors of copyrighted works (besides economic rights) enjoy moral rights on the basis of which they have the right to claim their authorship and require that their names be indicated on the copies of the work and in connection with other uses thereof. Moral rights are generally inalienable and remain with the creator even after he has transferred his economic rights, although the author may waive their exercise.
INDEXCARD, 6/7
Internet Protocol Number (IP Number)
Every computer using TCP/IP has a 32-bit Internet address, an IP number. This number consists of a network identifier and a host identifier. The network identifier is registered at and allocated by a Network Information Center (NIC); the host identifier is allocated by the local network administration.
IP numbers are divided into three main classes. Class A is reserved for very large organizations, Class B for medium-sized ones such as universities, and Class C for small networks.
Because of the increasing number of networks worldwide, networks belonging together, such as LANs forming a corporate network, are allocated a single IP number.
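The split of a 32-bit IP number into network and host identifier can be sketched as follows, assuming the conventional class boundaries (8 network bits for Class A, 16 for Class B, 24 for Class C). The function name is hypothetical, and the sketch covers only the three classes named above.

```python
import ipaddress


def split_classful(address):
    """Return (class, network_id, host_id) for a Class A, B or C IP number."""
    value = int(ipaddress.IPv4Address(address))  # the address as a 32-bit integer
    first_octet = value >> 24
    if first_octet < 128:
        address_class, network_bits = "A", 8
    elif first_octet < 192:
        address_class, network_bits = "B", 16
    elif first_octet < 224:
        address_class, network_bits = "C", 24
    else:
        raise ValueError("not a Class A, B or C address")
    host_bits = 32 - network_bits
    network_id = value >> host_bits           # the leading bits
    host_id = value & ((1 << host_bits) - 1)  # the remaining bits
    return address_class, network_id, host_id


print(split_classful("172.16.5.4"))  # ('B', 44048, 1284)
```

A Class C network leaves only 8 bits for hosts, which is why it suits small networks, while a Class A network leaves 24 bits for hosts.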
INDEXCARD, 7/7