CIS 2050 Chapter 5: Data and Knowledge Management – Flashcards
question
refer to the vast and constantly increasing amounts of data that modern organizations need to capture, store, process, and analyze. Big Data affect the following areas: - Human Resources (health benefits and hiring with online assessments) - Product Development - Operations - Marketing
answer
Big Data
question
is a repository of historical data that are organized by subject to support decision makers in the organization.
answer
Data Warehouse
question
is a group of logically related files that store data and the associations among them. A database consists of attributes, entities, tables, and relationships. Database design decisions are hard to undo, because the design constrains what the organization can do with its data for a long time.
answer
Database
question
Dennis Rollins, the owner of a small car lot in Bowdon, Georgia, needed an effective way to manage the data pertaining to his car lot. Achieving a solid online presence can be difficult for small used car dealers because there are so many makes and models of cars to sell and so many online outlets through which to advertise. The solution came in the form of Dealer Car Search, a company that specializes in creating Web sites for car dealers. Dealer Car Search provides products for small businesses, dealers, and dealer chains. What ultimately makes the company so successful, however, is its database.
answer
5.1 Rollins Automotive
question
When database developers in the firm's MIS group build a database, this tool creates a model of how users view a business activity.
answer
Entity-relationship (ER) Modeling
question
First, the amount of data increases exponentially with time. In addition, data are scattered throughout organizations, and they are collected by many individuals using various methods and devices. Another problem is that data are generated from multiple sources: internal sources, personal sources, and external sources. A further problem arises from the fact that, over time, organizations have developed information systems for specific business processes, such as transaction processing, supply chain management, and customer relationship management. Information systems that specifically support these processes impose unique requirements on data, which results in repetition and conflicts across the organization.
answer
The Difficulties of Managing Data
question
New York City passed Local Law 11, which mandated that city agencies systematically categorize data and make them available to the public. To accommodate this initiative, the city had to redefine its data practices. So, in September 2012, it created an "Open Data Policy and Technical Standards Manual," which outlines how city agencies can gather, structure, and automate data flows to meet the requirements of Local Law 11. The goal is to enable developers, entrepreneurs, and academics to put data to work in new and innovative ways. Literally anyone can employ his or her skills and creativity to utilize these data to improve the city's quality of life. Two other factors complicate data management: federal regulations and unstructured data overflow.
answer
5.2 New York City Opens Its Data to All
question
are those data that visitors and customers produce when they visit a Web site and click on hyperlinks.
answer
Clickstream
question
refers primarily to problems with the media on which the data are stored. Over time, temperature, humidity, and exposure to light can cause physical problems with storage media and thus make it difficult to access the data. The second aspect of data rot is that finding the machines needed to access the data can be difficult.
answer
Data Rot
question
is an approach to managing information across an entire organization.
answer
Data Governance
question
is a process that spans all organizational business processes and applications. It provides companies with the ability to store, maintain, exchange, and synchronize a consistent, accurate, and timely "single version of the truth" for the company's master data.
answer
Master Data Management
question
are a set of core data, such as customer, product, employee, vendor, geographic location, and so on, that span the enterprise information systems.
answer
Master Data
question
which are generated and captured by operational systems, describe the business's activities, or transactions.
answer
Transaction Data
question
defines Big Data as diverse, high-volume, high-velocity information assets that require new forms of processing to enable enhanced decision making, insight discovery, and process optimization.
answer
Gartner's Big Data
question
defines Big Data as vast data sets that: • Exhibit variety; • Include structured, unstructured, and semi-structured data; • Are generated at high velocity with an uncertain pattern; • Do not fit neatly into traditional, structured, relational databases (discussed later in this chapter); and • Can be captured, processed, transformed, and analyzed in a reasonable amount of time only by sophisticated information systems.
answer
Big Data Institute (TBDI)'s Big Data
question
• Traditional enterprise data: examples are customer information from customer relationship management systems, transactional enterprise resource planning data, Web store transactions, operations data, and general ledger data. • Machine-generated/sensor data: examples are smart meters; manufacturing sensors; sensors integrated into smartphones, automobiles, airplane engines, and industrial machines; equipment logs; and trading systems data. • Social data: examples are customer feedback comments; microblogging sites such as Twitter; and social media sites such as Facebook, YouTube, and LinkedIn. • Images captured by billions of devices located throughout the world, from digital cameras and camera phones to medical scanners and security cameras.
answer
Big Data generally consists of the following:
question
Volume: We have noted the incredible volume of Big Data in this chapter. Although the sheer volume of Big Data presents data management problems, this volume also makes Big Data incredibly valuable. Velocity: The rate at which data flow into an organization is rapidly increasing. Velocity is critical because it increases the speed of the feedback loop between a company and its customers. Variety: Traditional data formats tend to be structured, relatively well described, and they change slowly. Traditional data include financial market data, point-of-sale transactions, and much more. In contrast, Big Data formats change rapidly. They include satellite imagery, broadcast audio streams, digital music files, Web page content, scans of government documents, and comments posted on social networks.
answer
Big Data has Three Distinct Characteristics:
question
Databases that can manipulate structured as well as unstructured data, including inconsistent or missing data; they are useful when working with Big Data.
answer
NoSQL Database
question
- Creating Transparency - Enabling Experimentation - Segmenting Population to Customize Actions - Replacing/Supporting Human Decision Making with Automated Algorithms - Innovating New Business Models, Products, and Services - Organizations Can Analyze Far More Data
answer
Leveraging Big Data
question
From the time that businesses first adopted computer applications (mid-1950s) until the early 1970s, organizations managed their data in this environment.
answer
File Management Environment
question
is a collection of logically related records.
answer
Data File
question
• Data redundancy: The same data are stored in multiple locations. • Data isolation: Applications cannot access data associated with other applications. • Data inconsistency: Various copies of the data do not agree. In addition to minimizing these three problems, databases maximize the following: • Data security: Because data are "put in one place" in databases, there is a risk of losing a lot of data at once. Therefore, databases have extremely high security measures in place to minimize mistakes and deter attacks. • Data integrity: Data meet certain constraints; for example, there are no alphabetic characters in a Social Security number field. • Data independence: Applications and data are independent of one another; that is, applications and data are not linked to each other, so all applications are able to access the same data.
answer
Database minimizes the following problems:
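The data integrity item on this card (no alphabetic characters in a Social Security number field) can be sketched with a database constraint. This is a minimal illustration using Python's built-in sqlite3 module; the employee table, its columns, and the try_insert helper are hypothetical names chosen for the example, and the nine-digit rule is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table: the CHECK constraint rejects any ssn value that is not
# exactly nine characters long or that contains a non-digit character.
conn.execute(
    "CREATE TABLE employee ("
    " employee_id INTEGER PRIMARY KEY,"
    " ssn TEXT NOT NULL"
    " CHECK (length(ssn) = 9 AND ssn NOT GLOB '*[^0-9]*'))"
)


def try_insert(ssn: str) -> bool:
    """Return True if the database accepts the value, False if it rejects it."""
    try:
        conn.execute("INSERT INTO employee (ssn) VALUES (?)", (ssn,))
        return True
    except sqlite3.IntegrityError:
        return False


accepted = try_insert("123456789")   # digits only: satisfies the constraint
rejected = try_insert("12345678A")   # contains a letter: violates the constraint
row_count = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
print(accepted, rejected, row_count)
```

Because the rule lives in the database rather than in each application, every program that writes to the table is held to the same constraint.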
question
represents the smallest unit of data a computer can process. The term binary means that a bit can consist only of a 0 or a 1.
answer
Bit (Binary Digit)
question
A group of eight bits represents a single character. A byte can be a letter, a number, or a symbol.
answer
Byte
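The bit and byte cards can be illustrated by printing the eight bits behind one character. This sketch assumes ASCII encoding, where each character occupies exactly one byte; in other encodings a character may need more than one byte. The function name char_to_bits is made up for the example.

```python
def char_to_bits(ch: str) -> str:
    """Return the eight bits of a single ASCII character as a string of 0s and 1s."""
    if len(ch) != 1:
        raise ValueError("expected exactly one character")
    encoded = ch.encode("ascii")      # one byte per character under ASCII
    return format(encoded[0], "08b")  # zero-padded to eight binary digits


bits = char_to_bits("A")
print(bits)       # 01000001
print(len(bits))  # 8
```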
question
A logical grouping of characters into a word, a small group of words, or an identification number. Fields can also contain data other than text and numbers. They can contain an image, or any other type of multimedia.
answer
Field
question
is a grouping of logically related fields; describes an entity.
answer
Record
question
is a logical grouping of related records
answer
Data File (or a Table)
question
is a diagram that represents entities in the database and their relationships.
answer
Data Model
question
is a person, place, thing, or event—such as a customer, an employee, or a product—about which information is maintained. Entities can typically be identified in the user's work environment. A record generally describes an entity.
answer
Entity
question
is a specific, unique representation of the entity. For example, an instance of the entity STUDENT would be a particular student.
answer
Instance of an Entity
question
is each characteristic or quality of a particular entity.
answer
Attribute
question
is the identifier field or attribute that uniquely identifies a record.
answer
Primary Key
question
is another field that has some identifying information but typically does not identify the record with complete accuracy. For example, the student's major might be a secondary key if a user wanted to identify all of the students majoring in a particular field of study.
answer
Secondary Key
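The difference between a primary key and a secondary key can be sketched with a small in-memory table. This example uses Python's built-in sqlite3 module; the student table and its sample rows are invented for illustration. A lookup by primary key returns at most one record, while a lookup by the secondary key (major) can return several.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student ("
    " student_id INTEGER PRIMARY KEY,"  # primary key: unique for every record
    " name TEXT NOT NULL,"
    " major TEXT NOT NULL)"             # secondary key: shared by many records
)
# An index on the secondary key speeds up searches by major.
conn.execute("CREATE INDEX idx_student_major ON student (major)")
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(1, "Ana", "MIS"), (2, "Ben", "Finance"), (3, "Cho", "MIS")],
)

by_primary = conn.execute(
    "SELECT name FROM student WHERE student_id = ?", (3,)
).fetchall()
by_secondary = conn.execute(
    "SELECT name FROM student WHERE major = ? ORDER BY name", ("MIS",)
).fetchall()

print(by_primary)    # [('Cho',)]
print(by_secondary)  # [('Ana',), ('Cho',)]
```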
question
Document that shows data entities and attributes and relationships among them.
answer
Entity-Relationship (ER) Diagram
question
The process of designing a database by organizing data entities to be used and identifying the relationships among them.
answer
Entity-Relationship (ER) Modeling
question
refers to the maximum number of times an instance of one entity can be associated with an instance in the related entity.
answer
Cardinality
question
refers to the minimum number of times an instance of one entity can be associated with an instance in the related entity.
answer
Modality
question
which are attributes (attributes and identifiers are synonymous) that are unique to that entity instance.
answer
Identifiers
question
a single-entity instance of one type is related to a single-entity instance of another type.
answer
One-To-One (1:1) Relationship
question
This relationship means that a professor can have one or more courses, but each course can have only one professor.
answer
One-To-Many (1:M) Relationship
question
indicates that a student can have one or more courses, and a course can have one or more students.
answer
Many-To-Many (M:M) Relationship
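The professor/course and student/course relationships on these cards can be sketched as relational tables. This is one common way to model them, not the only one: a foreign key on the "many" side for 1:M, and a separate linking table for M:M. The table names, column names, and sample rows are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE professor (professor_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
-- 1:M: each course row points to exactly one professor.
CREATE TABLE course (
    course_id INTEGER PRIMARY KEY,
    title TEXT NOT NULL,
    professor_id INTEGER NOT NULL REFERENCES professor (professor_id)
);
CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
-- M:M: one row per (student, course) pair.
CREATE TABLE enrollment (
    student_id INTEGER NOT NULL REFERENCES student (student_id),
    course_id INTEGER NOT NULL REFERENCES course (course_id),
    PRIMARY KEY (student_id, course_id)
);
INSERT INTO professor VALUES (1, 'Lee');
INSERT INTO course VALUES (10, 'Databases', 1), (11, 'Networks', 1);
INSERT INTO student VALUES (100, 'Ana'), (101, 'Ben');
INSERT INTO enrollment VALUES (100, 10), (100, 11), (101, 10);
""")

# One professor, many courses.
courses_for_lee = [row[0] for row in conn.execute(
    "SELECT title FROM course WHERE professor_id = 1 ORDER BY title")]

# One course, many students ...
students_in_databases = [row[0] for row in conn.execute(
    "SELECT s.name FROM student s"
    " JOIN enrollment e ON e.student_id = s.student_id"
    " WHERE e.course_id = 10 ORDER BY s.name")]

# ... and one student, many courses.
courses_for_ana = [row[0] for row in conn.execute(
    "SELECT c.title FROM course c"
    " JOIN enrollment e ON e.course_id = c.course_id"
    " WHERE e.student_id = 100 ORDER BY c.title")]

print(courses_for_lee, students_in_databases, courses_for_ana)
```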
question
is a set of programs that provide users with tools to add, delete, access, modify, and analyze data stored in a single location. DBMSs also provide the mechanisms for maintaining the integrity of stored data, managing security and user access, and recovering information if the system fails.
answer
Database Management System (DBMS)
question
is based on the concept of two-dimensional tables. A relational database generally is not one big table—usually called a flat file—that contains all of the records and attributes.
answer
Relational Database Model
question
is the most popular query language used for this operation. SQL allows people to perform complicated searches by using relatively simple statements or key words. Typical key words are SELECT (to specify a desired attribute), FROM (to specify the table to be used), and WHERE (to specify conditions to apply in the query).
answer
Structured Query Language (SQL)
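The three key words named on this card can be seen together in one short query. The sketch below runs the query through Python's built-in sqlite3 module; the student table, its columns, and the GPA threshold are assumptions made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student ("
    " student_id INTEGER PRIMARY KEY, name TEXT, major TEXT, gpa REAL)"
)
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?, ?)",
    [(1, "Ana", "MIS", 3.8), (2, "Ben", "Finance", 3.1), (3, "Cho", "MIS", 3.5)],
)

query = """
SELECT name        -- the desired attribute
FROM student       -- the table to be used
WHERE gpa > 3.4    -- the condition to apply
ORDER BY name
"""
honor_roll = [row[0] for row in conn.execute(query)]
print(honor_roll)  # ['Ana', 'Cho']
```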
question
In QBE, the user fills out a grid or template (also known as a form) to construct a sample or a description of the data desired. Users can construct a query quickly and easily by using drag-and-drop features in a DBMS such as Microsoft Access. Conducting queries in this manner is simpler than keying in SQL commands.
answer
Query By Example (QBE)
question
defines the required format for entering the data into the database. The data dictionary provides information on each attribute, such as its name, whether it is a key or part of a key, the type of data expected (alphanumeric, numeric, dates, and so on), and valid values. Data dictionaries can also provide information on why the attribute is needed in the database; which business functions, applications, forms, and reports use the attribute; and how often the attribute should be updated.
answer
Data Dictionary
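A data dictionary can be pictured as a lookup structure that records, for each attribute, its name, whether it is a key, the expected type, and any valid values. The sketch below is a deliberately simplified stand-in written as a plain Python dictionary, not how a real DBMS stores its dictionary; every attribute name and valid value is hypothetical.

```python
from datetime import date

# Hypothetical data dictionary: one entry per attribute.
DATA_DICTIONARY = {
    "student_id": {"type": int, "key": True,
                   "description": "Unique student identifier"},
    "major": {"type": str, "key": False,
              "valid_values": {"MIS", "Finance", "Marketing"},
              "description": "Declared field of study"},
    "enrolled_on": {"type": date, "key": False,
                    "description": "Date of first enrollment"},
}


def validate(record: dict) -> list:
    """Check a record against the dictionary and return a list of problems."""
    errors = []
    for name, spec in DATA_DICTIONARY.items():
        if name not in record:
            errors.append(f"{name}: missing")
            continue
        value = record[name]
        if not isinstance(value, spec["type"]):
            errors.append(f"{name}: expected {spec['type'].__name__}")
        elif "valid_values" in spec and value not in spec["valid_values"]:
            errors.append(f"{name}: invalid value {value!r}")
    return errors


good = validate({"student_id": 1, "major": "MIS", "enrolled_on": date(2013, 9, 1)})
bad = validate({"student_id": "1", "major": "Art"})
print(good)
print(bad)
```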
question
is a method for analyzing and reducing a relational database to its most streamlined form to ensure minimum redundancy, maximum data integrity, and optimal processing performance.
answer
Normalization
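The redundancy that normalization removes can be shown with a small flat list. The sketch below performs only one illustrative step, moving repeated customer details into their own table keyed by customer ID; it is not a full walk through the normal forms, and the order data are invented.

```python
# A flat file: the customer's city is repeated on every order row.
flat_orders = [
    {"order_id": 1, "customer_id": "C1", "customer_city": "Austin", "item": "Pizza"},
    {"order_id": 2, "customer_id": "C1", "customer_city": "Austin", "item": "Salad"},
    {"order_id": 3, "customer_id": "C2", "customer_city": "Boise", "item": "Pizza"},
]


def normalize(rows):
    """Split flat rows into a customer table and an order table."""
    customers = {}  # customer_id -> attributes that depend only on the customer
    orders = []     # each order keeps just a reference to its customer
    for row in rows:
        customers[row["customer_id"]] = {"customer_city": row["customer_city"]}
        orders.append({
            "order_id": row["order_id"],
            "customer_id": row["customer_id"],
            "item": row["item"],
        })
    return customers, orders


customers, orders = normalize(flat_orders)
print(customers)  # each city is now stored once per customer
print(orders)
```

After the split, a change of city for customer C1 is made in one place instead of on every order row, which is the redundancy and integrity benefit the card describes.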
question
There are hundreds of operational satellites in orbit around the Earth. Each one completes an orbit of the Earth approximately every 100 minutes. The cameras and sensors that many of these satellites carry have made satellite imagery pervasive in today's society. Consider the German Aerospace Center, or Deutsches Zentrum für Luft- und Raumfahrt (DLR), with 7,000-plus employees at 16 locations in Germany. As the country's aerospace agency, the DLR maintains a research operation, known as the German Remote Sensing Data Center (DFD), that focuses on the Earth and on atmospheric observation for global monitoring, environmental studies, and security. The DLR was convinced that the DFD needed a single information management system designed to meet the DFD's various needs, the needs of its commercial clients, and the needs of the German nation. As a result, the DFD developed a Data and Information System (DIMS) to solve the challenge of data storage and archiving.
answer
5.3 Database Solution for the German Aerospace Center
question
is a low-cost, scaled-down version of a data warehouse that is designed for the end-user needs in a strategic business unit (SBU) or an individual department. Data marts can be implemented more quickly than data warehouses, often in less than 90 days
answer
Data Mart
question
• Organized by business dimension or subject. Data are organized by subject, for example, by customer, vendor, product, price level, and region. This arrangement differs from transactional systems, where data are organized by business process, such as order entry, inventory control, and accounts receivable. • Use online analytical processing. Typically, organizational databases are oriented toward handling transactions. That is, databases use online transaction processing (OLTP), where business transactions are processed online as soon as they occur. The objectives are speed and efficiency, which are critical to a successful Internet-based business operation. Data warehouses and data marts, which are designed to support decision makers but not OLTP, use online analytical processing. Online analytical processing (OLAP) involves the analysis of accumulated data by end users. We consider OLAP in greater detail in Chapter 12. • Integrated. Data are collected from multiple systems and then integrated around subjects. For example, customer data may be extracted from internal (and external) systems and then integrated around a customer identifier, thereby creating a comprehensive view of the customer. • Time variant. Data warehouses and data marts maintain historical data (i.e., data that include time as a variable). Unlike transactional systems, which maintain only recent data (such as for the last day, week, or month), a warehouse or mart may store years of data. Organizations utilize historical data to detect deviations, trends, and long-term relationships. • Nonvolatile. Data warehouses and data marts are nonvolatile; that is, users cannot change or update the data. Therefore the warehouse or mart reflects history, which, as we just saw, is critical for identifying and analyzing trends. Warehouses and marts are updated, but through IT-controlled load processes rather than by users. • Multidimensional. Typically the data warehouse or mart uses a multidimensional data structure. Recall that relational databases store data in two-dimensional tables. In contrast, data warehouses and marts store data in more than two dimensions. For this reason, the data are said to be stored in a multidimensional structure.
answer
The basic characteristics of data warehouses and data marts include the following:
question
Storage of data in more than two dimensions; a common representation is the data cube.
answer
Multidimensional Structure
question
A common representation for this multidimensional structure.
answer
Data Cube
question
are subjects such as product, geographic area, and time period that represent the edges of the data cube.
answer
Business Dimensions
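A data cube with the three business dimensions named on this card (product, geographic area, time period) can be sketched as a dictionary keyed by one value per dimension. This is a toy in-memory model, not how a warehouse product stores a cube; the products, regions, years, and unit counts are invented, and roll_up and slice_cube are hypothetical helper names.

```python
from collections import defaultdict

DIMENSIONS = ("product", "region", "year")

# Each cell of the cube: (product, region, year) -> units sold.
cube = {
    ("Nuts", "East", 2012): 50,
    ("Nuts", "West", 2012): 60,
    ("Bolts", "East", 2012): 40,
    ("Nuts", "East", 2013): 70,
    ("Bolts", "West", 2013): 90,
}


def roll_up(cells, keep):
    """Sum units over every dimension that is not listed in keep."""
    positions = [DIMENSIONS.index(name) for name in keep]
    totals = defaultdict(int)
    for key, units in cells.items():
        totals[tuple(key[i] for i in positions)] += units
    return dict(totals)


def slice_cube(cells, dimension, value):
    """Keep only the cells where one dimension has the given value."""
    position = DIMENSIONS.index(dimension)
    return {key: units for key, units in cells.items() if key[position] == value}


by_product = roll_up(cube, ["product"])
by_year = roll_up(cube, ["year"])
east_only = slice_cube(cube, "region", "East")

print(by_product)  # {('Nuts',): 180, ('Bolts',): 130}
print(by_year)     # {(2012,): 150, (2013,): 160}
print(sum(east_only.values()))  # 160
```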
question
• Source systems that provide data to the warehouse or mart • Data-integration technology and processes that prepare the data for use • Different architectures for storing data in an organization's data warehouse or data marts • Different tools and applications for the variety of users. (You will learn about these tools and applications in Chapter 5.) • Metadata, data-quality, and governance processes that ensure that the warehouse or mart meets its purposes
answer
The environment for data warehouses and marts includes the following:
question
In addition to storing data in their source systems, organizations need to extract the data, transform them, and then load them into a data mart or warehouse. This process is often called ETL, but the term data integration is increasingly being used to reflect the growing number of ways that source system data can be handled.
answer
ETL or Data Integration
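The extract, transform, and load steps described on this card can be sketched in a few lines. The example below assumes two made-up source systems that name and format the same facts differently, and it uses an in-memory SQLite database as a stand-in for the warehouse; sales_fact and all field names are hypothetical.

```python
import sqlite3

# Extract: rows as they might arrive from two hypothetical source systems.
crm_rows = [
    {"cust": " C1 ", "amount": "19.50", "date": "2013-01-05"},
    {"cust": "C2", "amount": "10.00", "date": "2013-01-06"},
]
web_rows = [
    {"customer_id": "c1", "total": 5.0, "ordered": "2013-01-07"},
]


def transform(crm, web):
    """Bring both sources to one layout: (customer_id, amount, sale_date)."""
    unified = []
    for row in crm:
        unified.append((row["cust"].strip().upper(), float(row["amount"]), row["date"]))
    for row in web:
        unified.append((row["customer_id"].strip().upper(), float(row["total"]), row["ordered"]))
    return unified


def load(conn, rows):
    """Append the cleaned rows to the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales_fact ("
        " customer_id TEXT, amount REAL, sale_date TEXT)"
    )
    conn.executemany("INSERT INTO sales_fact VALUES (?, ?, ?)", rows)
    conn.commit()


warehouse = sqlite3.connect(":memory:")
load(warehouse, transform(crm_rows, web_rows))

totals = warehouse.execute(
    "SELECT customer_id, SUM(amount) FROM sales_fact"
    " GROUP BY customer_id ORDER BY customer_id"
).fetchall()
print(totals)  # [('C1', 24.5), ('C2', 10.0)]
```

The transform step is where the two spellings of customer C1 are reconciled, so the warehouse sees one customer identifier rather than two.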
question
Most organizations use this approach, because the data stored in the warehouse are accessed by all users and represent the single version of the truth.
answer
One Central Enterprise Data Warehouse
question
This architecture stores data for a single application or a few applications, such as marketing and finance. Limited thought is given to how the data might be used for other applications or by other functional areas in the organization. This is a very application-centric approach to storing data.
answer
Independent Data Marts
question
This architecture contains a central data warehouse that stores the data plus multiple dependent data marts that source their data from the central repository. Because the marts obtain their data from the central repository, the data in these marts still comprise the single version of the truth for decision-support purposes.
answer
Hub and Spoke
question
to maintain data about the data.
answer
Metadata
question
whose primary role is to create information for other users. - IT developers and analysts
answer
Information Producers
question
utilize information created by others. - managers and executives
answer
Information Consumers
question
• End users can access needed data quickly and easily via Web browsers because these data are located in one place. • End users can conduct extensive analysis with data in ways that were not previously possible. • End users can obtain a consolidated view of organizational data.
answer
The benefits of data warehousing include the following:
question
Founded in 1972, Soon Chun Hyang University Hospital (SCHUH) has evolved into one of the largest healthcare institutions in South Korea. The hospital operates 2,800 beds in four different cities across the country: Seoul, Gumi, Cheonan, and Bucheon. As the number of patients and the amount of patient data dramatically increased, SCHUH faced a growing challenge in continuing to offer an excellent care experience. To maintain its high standards, the hospital needed to reduce admission times, process patient test results more quickly, and transfer patients for diagnosis or treatment at different locations more efficiently. SCHUH launched the Integrated Medical Information System (IMIS) project. The purpose of this project was to replace the information silos located at each of the hospital's four sites with a centralized source of patient information; namely, a data warehouse.
answer
5.4 Hospital Improves Patient Care with Data Warehouse
question
is a process that helps organizations manipulate important knowledge that comprises part of the organization's memory, usually in an unstructured format.
answer
Knowledge management (KM)
question
is another term for knowledge. Knowledge is information that is contextual, relevant, and useful. Simply put, knowledge is information in action.
answer
Intellectual Capital (or Intellectual Assets)
question
deals with more objective, rational, and technical knowledge. In an organization, explicit knowledge consists of the policies, procedural guides, reports, products, strategies, goals, core competencies, and IT infrastructure of the enterprise.
answer
Explicit Knowledge
question
is the cumulative store of subjective or experiential learning. In an organization, tacit knowledge consists of an organization's experiences, insights, expertise, know-how, trade secrets, skill sets, understanding, and learning.
answer
Tacit Knowledge
question
refer to the use of modern information technologies—the Internet, intranets, extranets, databases—to systematize, enhance, and expedite intrafirm and interfirm knowledge management. KMSs are intended to help an organization cope with turnover, rapid change, and downsizing by making the expertise of the organization's human capital widely accessible.
answer
Knowledge management systems (KMSs)
question
are the most effective and efficient ways of doing things.
answer
Best Practices
question
1. Create knowledge. Knowledge is created as people determine new ways of doing things or develop know-how. Sometimes external knowledge is brought in. 2. Capture knowledge. New knowledge must be identified as valuable and be represented in a reasonable way. 3. Refine knowledge. New knowledge must be placed in context so that it is actionable. This is where tacit qualities (human insights) must be captured along with explicit facts. 4. Store knowledge. Useful knowledge must then be stored in a reasonable format in a knowledge repository so that others in the organization can access it. 5. Manage knowledge. Like a library, the knowledge must be kept current. It must be reviewed regularly to verify that it is relevant and accurate. 6. Disseminate knowledge. Knowledge must be made available in a useful format to anyone in the organization who needs it, anywhere and anytime.
answer
The KMS Cycle