
Web Farming for the Data Warehouse R.D. Hackathorn



Summary

Web farming describes the systematic discovery and acquisition of Web content as input to a data warehouse system. This book introduces the field and serves as a tool for IT managers who need to implement a method for gathering data from the Internet.

Web Farming for the Data Warehouse Summary

Web Farming is an exciting new area emerging from data warehousing and Web technology. It is defined as systematic business intelligence obtained by farming the information resources of the Web, with the objective of enhancing the contents of a data warehousing system. Data warehouses are usually built from the contents of internal operational databases. With Web Farming, that internal focus can be balanced with external business factors, tracking global changes in the business environment moment by moment. Instead of surfing the Web haphazardly or gathering massive search results, Web Farming follows an evolutionary process to systematically discover, acquire, structure, and disseminate content, guided throughout by the intelligence that is critical to the enterprise. The book suggests a four-stage methodology, along with a growth strategy for the supporting architecture. It gives extensive coverage of standards, tools, and resources for Web Farming, along with an in-depth discussion of the important societal issues of privacy, confidentiality, intellectual property, and information espionage. This is the first book that focuses on the critical features of Web Farming.

This book will appeal to both technical and business audiences. The technical audience is anyone interested in using Web technology for data warehouse development, including corporate IT professionals, database administrators, network administrators, and others responsible for data warehousing and data mining. The business audience is anyone interested in establishing effective business intelligence, such as strategic planners, business development managers, competitive intelligence analysts, and market researchers.

Web Farming for the Data Warehouse Reviews

"Frankly, the book is ahead of its time. I think that not only will it help readers think outside the proverbial box, but also give them the roadmap for implementing their own Web farming."
Karen Watterson, data and knowledge warehouse design consultant

"What makes this book doubly useful, aside from the easy-to-read writing style, is that Richard has melded together the three biggest trends in our industry into a single strategy. Combining the internet, data warehousing, and knowledge management into one vision, Richard gives us insight into the next wave that will crash upon the industry ... I've been preaching this message to our customers only to find someone has written an entire book on it!"
Dan Graham, Strategy & Solutions Executive, IBM Global Business Intelligence Solutions

About R.D. Hackathorn

Dr. Richard D. Hackathorn is a well-known innovator and international educator in the information systems field with over thirty years of experience. In 1991 he founded Bolder Technology, Inc., to focus on professional education and technology innovation in the area of Enterprise Systems and Connectivity, serving professional clients including Lockheed Martin, Shell Oil, Microsoft Corporation, and Sybase. Dr. Hackathorn has published numerous articles and is the author of Enterprise Database Connectivity and co-author of Using the Data Warehouse.

Table of Contents

Contents; Preface
Part One: Plowing the Soil
Chapter 1: Motivation; 1.1 Nature of the Global Web; 1.1.1 The Paradigm; 1.1.2 The Dynamics; 1.1.3 The Diversity; 1.2 Combining the Web with Data Warehousing; 1.3 External Information for Business Intelligence; 1.4 The Objectives of Web Farming; 1.4.1 Why the Term Web Farming?; 1.4.2 Information Flow; 1.4.3 Reliability of Web Content; 1.4.4 A Step Toward Knowledge Management; 1.4.5 Rendezvous with the Data Warehouse; 1.5 Illustrations of Web Farming; 1.5.1 IBM's Grand Central Station; 1.5.2 Junglee Virtual Database; 1.5.3 Gaining Competitive Information; 1.5.4 Supporting the Call Center; 1.5.5 International Currency Monitoring; 1.5.6 Strategic Forecasting for a Manufacturing Conglomerate; 1.5.7 Monitoring Visa Applications for the U.K.; 1.5.8 HP Manages Knowledge for Its Professional Services Organization; 1.5.9 IBM Offers a Window into Patent Information; 1.5.10 Supporting the Deregulation of Electric Power
Chapter 2: Perspectives; 2.1 A Sense of Urgency; 2.1.1 What Is an Enterprise?; 2.1.2 The Crisis in Enterprise Systems; 2.2 Leveraging Data into Knowledge; 2.2.1 Data, Information, and Knowledge; 2.2.2 Knowledge and Power; 2.2.3 Managing the Knowledge Asset; 2.3 Rethinking the Way We Do Work; 2.3.1 Molded by the Industrial Revolution; 2.3.2 The Value Chain; 2.3.3 Redesigning Business Processes; 2.3.4 Competitive Forces; 2.3.5 The Dark Side of Competitive Advantage; 2.3.6 Missed Opportunities; 2.3.7 The Precious Moment of Customer Contact; 2.3.8 Automating and Enabling; 2.4 Sharing Information; 2.4.1 Information as a Unique Resource; 2.4.2 Tyranny of Disparate Data; 2.4.3 Toward an Information Ecology; 2.5 Creating Information Markets; 2.5.1 Basic Paradigms of Computing; 2.5.2 Publish-and-Subscribe; 2.5.3 Virtual Organization
Chapter 3: Foundations; 3.1 Web Technology; 3.1.1 Internet, Intranet, and Extranet; 3.1.2 Web Browsers and Servers; 3.1.3 Web-Enabled Databases; 3.1.4 Web Applets; 3.2 Data Warehousing; 3.2.1 Consistent Image of Business Reality; 3.2.2 Operational versus Informational; 3.2.3 Architectures for Data Warehousing; 3.2.4 Flows in Data Warehousing; 3.2.5 Web Warehousing; 3.3 Information Science; 3.3.1 Information Structuring; 3.3.2 Recall versus Precision; 3.3.3 Information Visualization
Chapter 4: Methodology; 4.1 Stages of Growth; 4.2 Stage One: Getting Started; 4.2.1 Document the Critical External Factors; 4.2.2 Formulate Discovery Plan; 4.2.3 Identify Content Providers; 4.2.4 Disseminate Information; 4.2.5 Compile a Business Case; 4.3 Stage Two: Getting Serious; 4.3.1 Legitimizing Web Farming; 4.3.2 Build the Infrastructure; 4.3.3 Refine the CEF List; 4.3.4 Maintain Historical Context; 4.3.5 Establish Intranet Site; 4.4 Stage Three: Getting Smart; 4.4.1 Build Selection and Extraction Filters; 4.4.2 Construct Pipelines to Primary Content Providers; 4.4.3 Analyze and Structure Content; 4.4.4 Publishing Content; 4.5 Stage Four: Getting Tough; 4.5.1 Rendezvous with the Warehouse; 4.5.2 Link to Other Systems; 4.5.3 Resolve Entity Mapping; 4.5.4 Establish Credibility Checks; 4.6 Then What?
Chapter 5: Architecture; 5.1 Stage One: Getting Started; 5.2 Stage Two: Getting Serious; 5.2.1 Transition into Data Center; 5.2.2 Common Database Server; 5.2.3 Content Probing and Feed Filters; 5.2.4 Intranet Resource Center; 5.2.5 New Position for System Administrator; 5.2.6 Managing Complexity; 5.2.7 Being Web-Farming Friendly; 5.3 Stage Three: Getting Smart; 5.3.1 Enhanced Control Database; 5.3.2 Integrated Analyst Workbench; 5.3.3 New Position for Agent Programmer; 5.3.4 Custom Pipeline to Content Provider; 5.3.5 Publish-and-Subscribe Delivery; 5.3.6 New Position for Content Broker; 5.4 Stage Four: Getting Tough; 5.4.1 Staging Area for the Warehouse; 5.4.2 The Validating and Loading Procedures; 5.4.3 New Position for a Data Administrator; 5.4.4 The Data Warehouse as the Resource Center
Chapter 6: Management; 6.1 Selling Web Farming to a Skeptical Management; 6.1.1 Deal with the Skepticism; 6.1.2 Focus on Business Issues; 6.1.3 Avoid the Technology Hype; 6.1.4 Deal with Data Quality; 6.1.5 Keep It Simple; 6.1.6 Build for the Long Term; 6.2 Organizational Designs; 6.3 The Position of Web Analyst; 6.3.1 Qualifications; 6.3.2 Career Opportunities for Research Librarians; 6.3.3 Career Opportunities for Intelligence Analysts; 6.4 Business Opportunities with Web Farming; 6.4.1 Content Provider; 6.4.2 System Integrator; 6.4.3 Tool Developer; 6.4.4 Education Provider
Part Three: Cultivating the Plants
Chapter 7: Standards; 7.1 Web Protocols; 7.1.1 TCP/IP Protocol Suite; 7.1.2 HyperText Transfer Protocol; 7.1.3 Uniform Resource Locator; 7.1.4 Domain Naming; 7.1.5 Uniform Resource Name; 7.1.6 MIME Data Types; 7.1.7 HyperText Markup Language; 7.1.8 HTML Anchor Tags; 7.1.9 HTML FORM Templates; 7.1.10 HTML META Tags; 7.1.11 Extensible Markup Language; 7.2 Metadata Standards; 7.2.1 ANSI/NISO Z39.50-1995; 7.2.2 Dublin Core and Warwick Framework; 7.2.3 Metadata Interchange Specifications; 7.2.4 Open Document Management API; 7.2.5 Resource Description Framework; 7.3 Standards Groups; 7.3.1 American National Standards Institute; 7.3.2 Internet Assigned Numbers Authority; 7.3.3 Internet Engineering Task Force; 7.3.4 Internet Society; 7.3.5 InterNIC; 7.3.6 National Committee for Information Technology Standards; 7.3.7 National Institute of Standards and Technology; 7.3.8 Organization for International Standards; 7.3.9 World Wide Web Consortium; 7.3.10 Professional Societies and Industry Consortiums; 7.4 Where To Now?
Chapter 8: Tools; 8.1 Generic Web Browsers; 8.2 Web Agents; 8.2.1 Alexa from Alexa Internet; 8.2.2 Copernic from Agents Technologies (MEB) Corp.; 8.2.3 LiveAgent Pro from AgentSoft Ltd.; 8.2.4 NetGetIt from Crossproduct Solutions; 8.2.5 Odyssey from General Magic; 8.2.6 Smart Bookmarks from FirstFloor; 8.2.7 WebCompass from QuarterDeck; 8.2.8 WebWhacker from Blue Squirrel; 8.2.9 Who's Talking from Software Solutions; 8.3 Hypertext Analysis and Transformation; 8.3.1 AltaVista Search from Digital Equipment Corp.; 8.3.2 Cambio from Data Junction; 8.3.3 Compass Server from Netscape; 8.3.4 Dynamic Reasoning Engine from Neurodynamics; 8.3.5 EXTRACT Tool Suite from Evolutionary Technologies; 8.3.6 Index Server from Microsoft; 8.3.7 INTEGRITY from Valley Technology Inc.; 8.3.8 Intelligent Miner for Text from IBM Corporation; 8.3.9 LinguistX from Inxight Software, Inc.; 8.3.10 NetOwl Intelligence Server from IsoQuest Inc.; 8.3.11 RetrievalWare from Excalibur Technologies; 8.3.12 Search '97 from Verity Inc.; 8.3.13 SearchServer from Fulcrum Technologies; 8.3.14 SmartCrawl from Inktomi; 8.3.15 Ultraseek from InfoSeek Corporation; 8.3.16 Webinator from Thunderstone; 8.3.17 ZyIndex from ZyLab International Inc.; 8.4 Information Visualization; 8.4.1 Discovery for Developers from Visible Decisions Inc.; 8.4.2 MAPit from Manning & Napier Information Services; 8.4.3 SemioMap from Semio Corporation; 8.4.4 SmartContent System from Perspecta; 8.4.5 Spotfire Pro from IVEE Development; 8.4.6 UMAP from TriVium; 8.4.7 Visual Insights from Lucent Technologies; 8.4.8 VizControls from Inxight Software, Inc.; 8.4.9 WEBSOM from Helsinki University; 8.5 Extended Relational Databases; 8.6 Data Marts; 8.6.1 Data Mart Solution from Sagent Technology; 8.6.2 Intelligent Warehouse from Platinum Technology; 8.6.3 PowerMart Suite from Informatica Corp.; 8.6.4 Tapestry from D2K Inc.; 8.6.5 Visual Warehouse from IBM Corporation; 8.7 Knowledge Management Systems; 8.7.1 Agentware i3 from Autonomy, Inc.; 8.7.2 Dataware II Knowledge Management Suite from Dataware, Inc.; 8.7.3 deliveryMANAGER from VIT; 8.7.4 FireFly Passport Office from Microsoft Corporation; 8.7.5 Folio Suite from Open Market, Inc.; 8.7.6 InfoMagnet from CompassWare Development, Inc.; 8.7.7 Knowledge Server from Intraspect; 8.7.8 Knowledge from KnowledgeX, Inc.; 8.7.9 Livelink Intranet from Open Text, Inc.; 8.7.10 WisdomBuilder from WisdomBuilder, LLC; 8.7.11 WiseWire from Lycos Corporation; 8.8 Suitability for Web Farming; 8.8.1 General Synopsis; 8.8.2 Production Database; 8.8.3 Discovery; 8.8.4 Acquisition; 8.8.5 Structuring; 8.8.6 Dissemination
Chapter 9: Resources; 9.1 General Discovery Services; 9.1.1 For Current Information; 9.1.2 AltaVista; 9.1.3 Argus Clearinghouse; 9.1.4 Excite; 9.1.5 HotBot; 9.1.6 InfoSeek; 9.1.7 Inter-Links; 9.1.8 InterNIC WhoIs Search; 9.1.9 Librarians' Index; 9.1.10 Lycos; 9.1.11 Magellan; 9.1.12 Mining Company; 9.1.13 Northern Light; 9.1.14 Open Text Pinstripe; 9.1.15 WebCrawler; 9.1.16 Yahoo!; 9.2 Meta-Discovery Services; 9.2.1 Daily Diffs (by InGenius Technologies, Inc.); 9.2.2 Mamma.Com; 9.2.3 MetaCrawler; 9.2.4 MetaFind; 9.2.5 SavvySearch; 9.2.6 Search.Com; 9.3 Specialized Resource Centers; 9.3.1 Business Researcher's Interests; 9.3.2 Data Warehouse Center; 9.3.3 Polson's Industry Research Desk; 9.4 General Content Providers; 9.4.1 Acxiom; 9.4.2 American Business Information; 9.4.3 Amulet; 9.4.4 Corporate Technology Information Services; 9.4.5 Dialog Information Services; 9.4.6 Disclosure; 9.4.7 Dow Jones Business Information Services; 9.4.8 EBSCO Information Services; 9.4.9 EDGAR-Online; 9.4.10 Electric Library; 9.4.11 Encyclopaedia Britannica; 9.4.12 Fairfax RESEARCH (AgeSearch); 9.4.13 FreeEDGAR; 9.4.14 GaleNet; 9.4.15 Hoover's, Inc.; 9.4.16 Infobase Publishers; 9.4.17 Information Access Company; 9.4.18 Information America; 9.4.19 Information Express; 9.4.20 Information Handling Services; 9.4.21 Information Quest; 9.4.22 Institute for Scientific Information; 9.4.23 Investext Group; 9.4.24 KnowX; 9.4.25 LEXIS-NEXIS; 9.4.26 M.A.I.D. Profound; 9.4.27 Manning & Napier Information Services; 9.4.28 MicroPatent; 9.4.29 Moody's Financial Information Services; 9.4.30 NewsNet; 9.4.31 ProQuest Direct; 9.4.32 Questel-Orbit Online Services; 9.4.33 SilverPlatter Information, Inc.; 9.4.34 TextWise; 9.4.35 Thomas Register; 9.4.36 Thomson & Thomson; 9.4.37 WESTLAW; 9.5 Industry-Specific Content Providers; 9.5.1 Electric Power Research Institute; 9.5.2 MediaTrak; 9.5.3 PaperChase; 9.5.4 Pharmsearch; 9.5.5 Trade Dimensions; 9.6 Market Research Firms; 9.6.1 Dataquest; 9.6.2 Harte-Hanks Direct Marketing; 9.7 Library Services; 9.7.1 Alexandria Digital Library; 9.7.2 Berkeley Digital Library SunSITE; 9.7.3 CARL Corporation; 9.7.4 Internet Archive; 9.7.5 Microsoft Corporate Library; 9.7.6 Online Computer Library Center (OCLC); 9.8 General News Agencies; 9.8.1 Wall Street Journal Interactive Edition; 9.9 IT Trade Publications; 9.10 Investment Services; 9.10.1 IPO Central; 9.11 Related Publications; 9.11.1 CyberSkeptic's Guide to Internet Research; 9.11.2 D-Lib Magazine; 9.11.3 Fulltext Sources Online; 9.11.4 InterNIC News; 9.11.5 Information Today; 9.11.6 Online Inc.; 9.12 Professional Societies; 9.12.1 American Society of Information Science; 9.12.2 Association for Computing Machinery; 9.12.3 Association for Information and Image Management; 9.12.4 Association of Independent Information Professionals; 9.12.5 Association of Research Libraries; 9.12.6 Coalition for Networked Information; 9.12.7 European Information Researchers Network; 9.12.8 Information Professionals Institute; 9.12.9 Internet Society; 9.12.10 Library and Information Technology Association; 9.12.11 Society for Competitive Information Professionals; 9.12.12 Society for Insurance Research; 9.12.13 Special Libraries Association; 9.13 Book Distributors; 9.14 Intelligence and Investigative Resources; 9.14.1 Avert, Inc.; 9.14.2 Jane's Information Group; 9.14.3 Strategic Forecasting Intelligence Services; 9.15 U.S. Government Agencies; 9.15.1 Federal Web Locator; 9.15.2 THOMAS; 9.15.3 PACER: Public Access to Court Electronic Records; 9.15.4 Securities and Exchange Commission; 9.15.5 Social Security Administration; 9.15.6 U.S. Bureau of the Census; 9.15.7 U.S. Intelligence Community; 9.15.8 U.S. Patent and Trademark Office
Chapter 10: Techniques; 10.1 Discovery; 10.1.1 The Terrain of Cyberspace; 10.1.2 Discovery Skills; 10.1.3 Starting Point; 10.1.4 What Is Indexed?; 10.1.5 Refining the Search; 10.1.6 Advanced Searching; 10.1.7 Profiling Web Pages; 10.2 Acquisition; 10.2.1 Web Crawling; 10.2.2 Accessing Dynamic Pages; 10.2.3 Organizing Dynamic References; 10.2.4 Other Acquisition Techniques; 10.3 Structuring; 10.3.1 Organizing Web Content; 10.3.2 Parsing Semistructured Data
Part Four: Harvesting the Crop
Chapter 11: Society; 11.1 Information Ecology; 11.1.1 Public Good or Private Property; 11.1.2 Attention Economics; 11.2 Privacy and Confidentiality; 11.2.1 U.S. Freedom of Information Act; 11.2.2 U.S. Federal Privacy Act; 11.2.3 Code of Fair Information Practices; 11.2.4 U.S. Fair Credit Reporting Act; 11.2.5 TRUSTe; 11.2.6 Firefly Network Privacy Policy; 11.2.7 IBM Fair Information Practices; 11.2.8 Platform for Privacy Preferences Project of W3C; 11.2.9 Further Information about Privacy; 11.3 Intellectual Property Rights; 11.3.1 Protection of Databases; 11.3.2 Fair Use; 11.3.3 Copyright Clearance Center; 11.3.4 Infringements on the Web; 11.3.5 Further Information about Intellectual Property Rights; 11.4 Competitive Intelligence; 11.5 Industrial Espionage; 11.6 Information Warfare; 11.7 Code of Ethics for Web Farming
Chapter 12: Challenges; 12.1 The Challenges: Chapter by Chapter; 12.2 The Big Picture; 12.3 Your Response; 12.4 Web Farmers, Unite!
Glossary; Acronyms; Bibliography; For More Information; About the Author

Additional information

SKU: GOR002807665
ISBN-13: 9781558605039
ISBN-10: 1558605037
Title: Web Farming for the Data Warehouse by R.D. Hackathorn
Condition: Used - Good
Binding: Hardback
Publisher: Elsevier Science & Technology
Publication date: 1998-12-21
Pages: 448
Book picture is for illustrative purposes only; actual binding, cover, or edition may vary.
This is a used book. It has been read by someone else and will show signs of wear and previous use. Overall we expect it to be in good condition, but if you are not entirely satisfied, please get in touch with us.
