Welcome To JPDESIGNS Blog

Friday, 27 February 2015

Introduction to Data Mining

This is an introduction to Data Mining. I have also recorded it as a podcast, available in the Podcast sidebar.

Data Mining

Business Intelligence (BI) Systems

·    Business intelligence (BI) systems are information systems that assist managers and other professionals:
-To analyse current and past activities
-To predict future events
·    Two broad categories:
-Reporting
-Data mining

Data for BI Systems

·    BI systems obtain data in three ways:
-From the operational database:
Read and process data only
DO NOT insert, modify or delete operational data
-From extracts from the operational database:
Data is in a BI DBMS
May be a different DBMS than the operations DBMS
-From data purchased from data vendors

Data Mining Applications

·     Data mining applications are used to:
-Perform what-if analysis
-Make predictions
-Facilitate decision making
·    Data mining applications use sophisticated statistical and mathematical techniques

The Convergence of Disciplines

Data mining arises from the convergence of several disciplines and trends: statistics and mathematics, artificial intelligence and machine learning, huge databases, data management technology, cheap computer processing and storage, and sophisticated marketing, finance and other business professionals.

Data Mining

·    The process of extracting valid, previously unknown, comprehensible, and actionable information from large databases and using it to make crucial decisions (Simoudis, 1996)
·    Involves the analysis of data and the use of software techniques for finding hidden and unexpected patterns and relationships in sets of data
·    Data mining can provide huge paybacks for companies that have made a significant investment in data warehousing
·    A relatively new technology, but already used in a number of industries:

·    Retail / Marketing
-Identifying buying patterns of customers
-Finding associations among customer demographic characteristics
-Predicting response to mailing campaigns
-Market basket analysis
·    Banking
-Detecting patterns of fraudulent credit card use
-Identifying loyal customers
-Predicting customers likely to change their credit card affiliation
-Determining credit card spending by customer groups
·    Insurance
-Claims analysis
-Predicting which customers will buy new policies
·    Medicine
-Characterizing patient behaviour to predict surgery visits
-Identifying successful medical therapies for different illnesses
·    Mining analogy:
-Large volumes of data are sifted in an attempt to find something worthwhile, just as a mining operation sifts through large amounts of low-grade material to find something of value

Comparison of DM and DBMS

·    DBMS queries are answered directly from the data held, e.g.
-Last month's sales for each product
-Sales grouped by customer age etc.
-List of customers who lapsed their policies
·    Data mining infers knowledge from the data held in order to answer queries, e.g.
-What characteristics do customers who lapsed their policies share, and how do they differ from those who renewed their policies?
-Why is the Cleveland division so profitable?

Data Warehousing

Modern organizations are drowning in data but starving for information. Why?
1.   Information gap: organizations have developed their information systems and supporting databases in a fragmented way over many years, so it is difficult for managers to locate and use accurate information.
2.   Most systems were developed to support operational processing, with little thought given to the information or analytical tools needed for decision making.
·    Operational processing (transaction processing): captures, stores and manipulates data to support the daily operations of the business
·    Information processing: analysis of data to support decision making
·    Data warehouses bridge the information gap: they consolidate and integrate information from many internal and external sources and arrange it in a meaningful format for making accurate and timely business decisions
·    They support executives, managers and business analysts in making complex decisions through applications such as:
-Analysis of trends
-Target marketing
-Competitive analysis
-Customer relationship management

Definition

Data warehouse:

A subject-oriented, integrated, time-variant, non-updatable collection of data used in support of management decision-making processes
·    Subject-oriented:
e.g. customers, patients, students, products

·    Integrated:
Consistent naming conventions, formats, encoding structures; from multiple data sources
·    Time-variant:
Can study trends and changes
·    Non-updatable:
Read-only, periodically refreshed from operational systems; cannot be updated by end users.

Data Mart

A data warehouse that is limited in scope

Benefits of Data Warehousing

·         Potential high returns on investment
·         Competitive advantage
·         Increased productivity of corporate decision-makers

Data Warehousing Institute Awards

·         Past winners include:
·         Continental Airlines
·         Toyota
·         Bank of America
·         Iowa Department of Revenue

Operational Data Sources

·     Main sources are online transaction processing (OLTP) databases.
·     Other sources include personal databases and spreadsheets, Enterprise Resource Planning (ERP) files, and web usage log files.
·     Extraction is periodic, so data in the warehouse is not completely current.

What is a Data Mart? (Smaller version)

Provides localization for departments or functions
Reduces demands on the
·         Warehouse
·         Network
In an independent data mart architecture, each data mart is filled by its own Extract, Transform, Load (ETL) process
Data marts: Mini-warehouses, limited in scope

The ETL Process

·         Capture /Extract
·         Scrub or data cleansing
·         Transform
·         Load and index
ETL = Extract, Transform, and Load

Capture/Extract: obtaining a snapshot of a chosen subset of the source data for loading into the data warehouse
-Static extract = capturing a snapshot of the source data at a point in time
-Incremental extract = capturing changes that have occurred since the last static extract
Scrub/Cleanse: uses pattern recognition and AI techniques to upgrade data quality
-Fixing errors: misspellings, erroneous dates, incorrect field usage, mismatched addresses, missing data, duplicate data, inconsistencies
-Also: decoding, reformatting, time stamping, conversion, key generation, merging, error detection/logging, locating missing data
Transform: convert data from the format of the operational system to the format of the data warehouse
-Record-level: selection (data partitioning), join (data combining), aggregation (data summarization)
-Field-level: single-field (from one field to one field) and multi-field (from many fields to one, or one field to many)
Load/Index: place transformed data into the warehouse and create indexes
-Refresh mode: bulk rewriting of target data at periodic intervals
-Update mode: only changes in source data are written to the data warehouse
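
To make the four ETL steps a bit more concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the sample rows, the field names and the choice of "total sales per day" as the transformation are all made up, and a real ETL tool does far more. It uses a static extract and a refresh-mode load.

```python
from datetime import date

# Hypothetical rows captured from an operational system (not from the post).
SOURCE_ROWS = [
    {"id": 1, "name": " alice ", "sold_on": "2015-02-01", "amount": "10.50"},
    {"id": 2, "name": "Bob", "sold_on": "2015-02-01", "amount": "4.00"},
    {"id": 2, "name": "Bob", "sold_on": "2015-02-01", "amount": "4.00"},    # duplicate data
    {"id": 3, "name": "Carol", "sold_on": "not a date", "amount": "7.25"},  # erroneous date
]


def extract(rows):
    """Capture/Extract: take a static snapshot (a copy) of the source rows."""
    return [dict(row) for row in rows]


def scrub(rows):
    """Scrub/Cleanse: drop bad dates and duplicates, tidy names, convert types."""
    seen_ids, clean = set(), []
    for row in rows:
        try:
            sold_on = date.fromisoformat(row["sold_on"])
        except ValueError:
            continue  # erroneous date: skip the row
        if row["id"] in seen_ids:
            continue  # duplicate row: skip it
        seen_ids.add(row["id"])
        clean.append({
            "id": row["id"],
            "name": row["name"].strip().title(),
            "sold_on": sold_on,
            "amount": float(row["amount"]),
        })
    return clean


def transform(rows):
    """Transform (record-level aggregation): total sales per day."""
    totals = {}
    for row in rows:
        totals[row["sold_on"]] = totals.get(row["sold_on"], 0.0) + row["amount"]
    return totals


def load(warehouse, totals):
    """Load in refresh mode: bulk rewrite of the target data."""
    warehouse.clear()
    warehouse.update(totals)
    return warehouse


warehouse = load({}, transform(scrub(extract(SOURCE_ROWS))))
print(warehouse)
```

An update-mode load would instead write only the rows that changed since the last extract, rather than clearing the target first.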

Introduction To Databases

This post gives a basic introduction to databases, with definitions and explanations of some of the parts of a database and some database types.
There is an accompanying podcast for this to assist you.

Definitions

Database: Organized collection of logically related data.
Data: Stored representations of meaningful objects and events
- Structured: numbers, text, dates
- Unstructured: images, video, documents
Information: Data processed to increase knowledge in the person using the data.
Metadata: Data that describes the properties and context of user data.
Graphical displays: Turn data into useful information that managers can use for decision making and interpretation.

The characteristics of Databases

·         The purpose of a database is to help people track things of interest to them.
·         Data is stored in tables, which have rows and columns like a spreadsheet. A database may have multiple tables, where each table stores data about a different thing.
·         Each row in the table stores data about an occurrence or instance of the thing of interest.
·         A database stores data and relationships.
·         Tables are related through Primary and foreign keys.

Why Use A Database?

·         The purpose of a database is to help people and organizations keep track of things.
·         Problems with using lists to store data:
-Data inconsistencies
-Data Privacy: The departments want to store some, but not all, of their data.

Components of a Database system

·         User -- Database Application -- DBMS -- Database  (DBMS = Data Base Management System)
     -Create
     - Process
     - Administer

Components of a Database system with SQL

·         User -- Database Application -- SQL -- DBMS -- Database
     - Create
     - Process
     - Administer


Applications, the DBMS, and SQL

·         Applications are the computer programs that users work with.
·         The Database Management System (DBMS) creates, processes, and administers databases.
·         Structured Query Language (SQL) is an internationally recognized standard database language that is used by all commercial DBMS products.

Database Applications

·         Create and process forms
·         Process user queries
·         Create and process reports
·         Execute application logic
·         Control application


The DBMS

·         Create database
·         Create tables
·         Create supporting structures(e.g. indexes)
·         Read database data
·         Modify (insert, update, or delete data)
·         Maintain database structures
·         Enforce rules
·         Control concurrency
·         Provide security
·         Perform backup and recovery

Database Contents

·         Tables of user data
·         Metadata
·         Indexes
·         Stored procedures
·         Triggers
·         Security data
·         Backup/ recovery data

Microsoft Access

·        Microsoft Access is a low-end product intended for individual users and small work groups.
·        Microsoft Access tries to hide much of the underlying database technology from the user.
·        A good strategy for beginners, but not for database professionals.

What Is Microsoft Access?

·         Microsoft Access is a DBMS plus an application generator:
  -The DBMS creates, processes, and administers Microsoft Access databases.
  -The application generator includes query, form, and report components.
·         The Microsoft Access DBMS engine is called Jet, which is not sold as a separate product.
·         Microsoft Access 2000 and later can be used as an application generator for the Microsoft SQL Server DBMS

Prominent DBMS Products

·         Microsoft Access 2010
·         Microsoft SQL Server 2008
  -New: Microsoft SQL Server 2012 Express
·         Oracle Corporation Oracle Database 12c
·         MySQL 5.6
·         IBM DB2 

Three Types of Database Design

·        From Existing Data
Analyse spreadsheets and other data tables
Extract data from other databases
Design using normalization principles
·       New System Development
Create data model from application requirements
Transform data model into database design
·       Database Redesign
Migrate databases to newer databases
Integrate two or more databases
Reverse engineer and design new databases using normalization principles and data model transformation

The Relational Database Model

·         The dominant database model is the relational database model; all current major DBMS products are based on it.
·         It was created by IBM engineer E.F. Codd in 1970
·         It was based on mathematics called relational algebra

Object Oriented DBMS (OODBMS)

·         Object-oriented programming became widespread from the mid-1980s
·         The goal of an OODBMS is to store object-oriented programming objects in a database without having to transform them into relational format
·         Object-relational DBMS products, such as Oracle 8i and 9i, allow both relational and object views of data on the same database
·         So far, OODBMS products have not been a commercial success, due to the high cost of relational to object-oriented transformation

Functions of a DBMS

·         Data storage, retrieval, and update
-The fundamental function of a DBMS
·         Transaction support
-A mechanism that ensures the all-or-nothing property of transactions is enforced: either every update in a transaction is made, or none is
·         Concurrency control services
-Ensure that updates are carried out properly when multiple users are accessing the same data
·         Recovery services
-Recover the database in the event that it is damaged in some way
·         Authorization services
-A mechanism to ensure that only authorised users can access the database
·         Support for data communication
-The DBMS needs to be able to integrate with data communications software
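
The all-or-nothing property is easiest to see with a small example. This is only a sketch using Python's built-in sqlite3 module; the account table, the balances and the transfer function are hypothetical and not part of the original notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: a balance is never allowed to go below zero.
conn.execute(
    "CREATE TABLE account (id INTEGER PRIMARY KEY, "
    "balance REAL NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account (id, balance) VALUES (?, ?)",
                 [(1, 100.0), (2, 50.0)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move money between accounts: both updates are applied, or neither is."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute("UPDATE account SET balance = balance + ? WHERE id = ?",
                         (amount, dst))
            conn.execute("UPDATE account SET balance = balance - ? WHERE id = ?",
                         (amount, src))
        return True
    except sqlite3.IntegrityError:
        return False


ok = transfer(conn, 1, 2, 30.0)       # succeeds: both rows change
failed = transfer(conn, 1, 2, 500.0)  # second update breaks the CHECK, so the first is undone
balances = dict(conn.execute("SELECT id, balance FROM account ORDER BY id"))
print(ok, failed, balances)
```

In the failed transfer the credit to account 2 has already been made when the debit is rejected, and the rollback removes it again, which is exactly what transaction support is for.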

Thursday, 26 February 2015

Hexadecimal Conversion

Hexadecimal Conversion

Hi Guys,
in this post I have put a screen cast of the Hexadecimal table and converted a Hexadecimal number to Decimal.

Hexadecimal  Conversion Table

Decimal     Hexadecimal
   0                     0
   1                     1
   2                     2
   3                     3
   4                     4
   5                     5 
   6                     6
   7                     7
   8                     8
   9                     9
  10                    A
  11                    B
  12                    C
  13                    D
  14                    E
  15                    F
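
The table above is all you need to convert by hand: multiply the running total by 16 and add the value of the next digit. Here is a small Python sketch of that method. The function names are my own, and Python's built-in int(text, 16) does the same job in practice.

```python
HEX_DIGITS = "0123456789ABCDEF"  # position in this string = decimal value of the digit


def hex_to_decimal(text):
    """Convert a hexadecimal string such as '2F' to a decimal integer."""
    total = 0
    for ch in text.upper():
        total = total * 16 + HEX_DIGITS.index(ch)
    return total


def decimal_to_hex(number):
    """Convert a non-negative decimal integer to a hexadecimal string."""
    if number == 0:
        return "0"
    digits = []
    while number > 0:
        number, remainder = divmod(number, 16)
        digits.append(HEX_DIGITS[remainder])
    return "".join(reversed(digits))


print(hex_to_decimal("2F"))   # 2 x 16 + 15
print(decimal_to_hex(255))
```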






Tuesday, 24 February 2015

Systems Analysis Methods Part 2

Project Management is the process of planning and controlling development of a system within a specified timeframe at a minimum cost with the right functionality. This applies to any system.
A project manager has the primary responsibility for managing the hundreds of tasks and roles that need to be carefully coordinated.
The 2000 Standish Group study found that:
·         Only 28% of system development projects were successful.
·         72% of projects were cancelled, completed late, completed over budget, and/or limited in functionality.
Key Steps in Managing a project
·         Identifying project size (scope)
·         Creating and managing the work-plan
·         Staffing the project
·         Coordinating project activities
Types of Project Management Software: Microsoft Project, Plan View, PM Office
Identifying project size
Project Manager’s Balancing Act
·         Project Management involves making trade-offs
·         Modifying one element requires adjusting the others

Project Estimation ("guesstimate")
·         The process of assigning projected values for time and effort
·         Sources of estimates:
-Methodology in use
-Actual previous projects
-Experienced developers
·         Estimates begin as a range and become more specific as the project progresses

                                                             Planning          Analysis             Design               Implementation
Typical industry standards for business applications         15%               20%                  35%                  30%
Estimates based on actual figures for first stages of SDLC   Actual:           Estimated:           Estimated:           Estimated:
                                                             4 person-months   5.33 person-months   9.33 person-months   8 person-months
SDLC = system development life cycle
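
The estimated figures above follow from the actual planning effort: if planning took 4 person-months and is typically 15% of a project, the whole project is about 4 / 0.15 = 26.67 person-months, and each phase is its share of that. A short Python sketch of the arithmetic (the function name is my own):

```python
# Typical industry shares of total effort per SDLC phase, from the table above.
INDUSTRY_SHARE = {
    "planning": 0.15,
    "analysis": 0.20,
    "design": 0.35,
    "implementation": 0.30,
}


def estimate_from_planning(actual_planning_effort):
    """Scale the actual planning effort (person-months) up to every phase."""
    total = actual_planning_effort / INDUSTRY_SHARE["planning"]
    return {phase: round(total * share, 2) for phase, share in INDUSTRY_SHARE.items()}


estimates = estimate_from_planning(4)
print(estimates)
```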

Creating and Managing the work plan
Work Plan Information      Example
Name of task               Perform economic feasibility
Start date                 Jan 05, 2005
Completion date            Jan 19, 2005
Person assigned            Project sponsor: Mary Smith
Deliverable(s)             Cost-benefit analysis
Completion status          Open
Priority                   High
Resources needed           Spreadsheet
Estimated time             16 hours
Actual time                14.5 hours
LOC = Line of Code
Project Work plan
·         List of all tasks in the work breakdown structure, plus
·         Duration of task
·         Current task status
·         Task dependencies
·         Milestone (dates)
Tracking Project Tasks
·         Gantt chart
-Bar chart format
-Useful to monitor project status at any point in time
·         PERT chart
-Flowchart format
-Illustrates task dependencies
Identifying Tasks
·         Methodology
-Using a standard list of tasks
·         Top-down approach
-Identify highest level tasks
-Break them into increasingly smaller units
-Organize into a work breakdown structure
Managing Scope
·         Scope creep (size)
·         JAD (joint application development) and prototyping
·         Formal change approval
·         Defer additional requirements as future system enhancements



Time boxing
·         Fixed deadline
·         Reduced functionality, if necessary
·         Fewer “finishing touches”
Staffing the Project
Staffing Attributes
·         Staffing levels will change over a project’s lifetime
·         Adding staff may add more overhead than additional labour (this is Brooks's law)
·         Using teams of 8-10 reporting in a hierarchical structure can reduce complexity

Coordinating Project Activities
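CASE (computer-aided software engineering) Tools
Upper CASE tools support the Planning and Analysis phases
Lower CASE tools support the Design and Implementation phases
Integrated CASE (I-CASE) tools span all four phases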
Classic Mistakes
·         Overly optimistic schedule
·         Failing to monitor schedule
·         Failing to update schedule
·         Adding people to a late project
Oversight committee

The project manager works with the client, users, subcontractors and team leaders, all of whom are members of the oversight committee

Feasibility
Feasibility is a measure of how beneficial or practical the development of an information system will be to an organization
Feasibility analysis/study is the process by which feasibility is measured. A project that is feasible at one point in time may become infeasible at a later point



Concept of feasibility checkpoints
Five Tests for Feasibility
1.       Operational
2.       Technical
3.       Schedule
4.       Legal
5.       Economic

·         Operational feasibility
Operational feasibility measures how well a specific solution will work in the organization
It is also a measure of how people feel about the system/project
How do the end-users and management feel about the problem solution?
·         Technical feasibility
Measures the practicality of a specific technical solution and the availability of technical resources and expertise
Most difficult area to assess at this stage
Go/ No Go Design
Feasibility tests
Three major considerations
Development Risk: can the system element be designed so that necessary function and performance are achieved within the constraints uncovered during analysis
Resource availability:
Has relevant technology progressed to a state to support the system?
Schedule feasibility:
Measures how reasonable the project timeline is. Given our technical expertise, are the project deadlines reasonable?
Missed schedules are bad, but inadequate systems are worse
Feasibility tests need to determine whether the deadlines are mandatory or desirable. Unless the deadline is absolutely mandatory, it is preferable to deliver a properly functioning information system late than to deliver an error-prone information system on time.
Legal Feasibility
Legal Feasibility is a determination of any infringement, violation or liability that could result from development of a system.
All projects are feasible – given unlimited resources and infinite time!


Economic Feasibility
Measures the cost effectiveness of a project or solution. Often called a cost-benefit analysis.
Generally the bottom line consideration for most systems. Exceptions: national defence systems, high technology applications e.g. Space Program.
System Development Costs
Usually one-off costs that will not recur after the project has been completed.
·         Facilities
·         Equipment and installation
·         Software and licences
·         Consulting fees
·         Training/ personnel costs
The lifetime benefits must recover both the developmental and operating costs. (Key)
System Operating Costs
System operating costs recur throughout the lifetime of the system.
Costs are classified as fixed or variable
Fixed costs occur at regular intervals and at relatively fixed rates
·         Lease
·         Salaries
Variable costs occur in proportion to some usage factor
·         Supplies
·         Overhead costs, e.g. utilities, maintenance
What Benefits Will The System Provide?
Benefits normally increase profits or decrease costs.
Tangible Benefits
Tangible Benefits are those that can be easily quantified.
Measured in terms of monthly or annual savings or profit to the firm.
·         Fewer processing errors
·         Reduced expenses
·         Increased sales
·         Reducing staff- automation of manual functions




Intangible benefits
Intangible benefits are those benefits believed to be difficult or impossible to quantify
·         Improved customer goodwill
·         Improved employee morale
·         A sales tracking system that leads to better information for marketing decisions
·         Reputation
Pay Back Analysis
Break-even point is when lifetime benefits will overtake the lifetime costs.
Each of the following questions belongs to one type of feasibility:
A.      Will our current printer be able to handle the new reports and forms required of a new system?    Technical
B.      What are the fixed and variable costs of operating the system?    Economic
C.      Does the system provide adequate throughput and response time?    Operational
D.      Does the system offer adequate service level and capacity to reduce the costs of the business or increase the profits of the business?    Operational
E.      What are the tangible and intangible benefits of the system?    Economic
F.      Does the system offer adequate controls to protect against fraud and embezzlement and to guarantee the accuracy and security of data and information?    Operational
G.      Does the system make maximum use of available resources, including people, time and flow of forms, with minimum processing delays, and the like?    Operational
H.      Does management support the system?    Operational
I.      What is the net present value of the system?    Economic
J.      How will the working environment of the user change?    Operational
K.      How do the users feel about their role in the new system?    Operational
L.      Do we have the expertise to implement the solution?    Technical
M.      What is the payback period for the proposed system solution?    Economic
N.      Does the system provide users and managers with timely, pertinent, accurate, and usefully formatted information?    Operational
O.      What is the return on investment for the new system?    Economic
P.      Is the project deadline mandatory or desirable?    Schedule
Q.      Does the system provide desirable and reliable service to those who need it?    Operational
R.      Is the system flexible and expandable?    Operational
S.      Are the resources available in our data processing?    Technical
Sample Candidate System
·         Custom
·         Off the shelf
A feasibility analysis matrix is a tool for comparing candidate systems
We weight each type of feasibility as a percentage and then score each candidate against each type
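
Here is a rough Python sketch of how such a matrix could be scored. The weights and the scores out of 100 are invented purely for illustration (they are not from any real project), and the two candidates are simply the "custom" and "off the shelf" options listed above.

```python
# Hypothetical weights for the five feasibility types; they should sum to 1.0 (100%).
WEIGHTS = {
    "operational": 0.30,
    "technical": 0.20,
    "schedule": 0.10,
    "legal": 0.10,
    "economic": 0.30,
}

# Hypothetical scores (0-100) for each candidate system.
CANDIDATES = {
    "custom": {"operational": 90, "technical": 60, "schedule": 50, "legal": 80, "economic": 60},
    "off the shelf": {"operational": 70, "technical": 90, "schedule": 90, "legal": 80, "economic": 80},
}


def weighted_score(scores, weights):
    """Sum of (score x weight) over every feasibility type."""
    return sum(scores[kind] * weights[kind] for kind in weights)


# Rank the candidates from highest to lowest weighted score.
ranking = sorted(
    CANDIDATES,
    key=lambda name: weighted_score(CANDIDATES[name], WEIGHTS),
    reverse=True,
)

for name in ranking:
    print(name, round(weighted_score(CANDIDATES[name], WEIGHTS), 1))
```

The ranking is only as good as the weights, so it is worth agreeing those with the project sponsor before scoring anything.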


Net Payback Analysis
The time value of money recognizes that a dollar today is worth more than a dollar one year from now.
(Future) value (FV) of £1 in n years, at interest rate, i, can be described by the following equation.
FV = PV(1 + i)^n
Where
FV = Future value
PV = Present value
n = Number of years
i = Interest rate (discount rate)
Example: Net payback analysis
The value of €500 after 2 years at an annual interest  rate of 6% would be:
FV = 500(1 + 0.06)^2 = 500(1.1236) = €561.80
Restating the above equation:
FV / (1 + i)^n = PV
FV x 1/(1 + i)^n = PV
where 1/(1 + i)^n is the discount factor
Example: Net Payback Analysis
If you were offered €561.80 in two years' time, at a 6% discount rate, what is the value today?
PV = €561.80 x 1/(1 + 0.06)^2
PV = €561.80 x 1/1.1236
PV = €561.80 x 0.8900
PV = €500
The discount factor is always:
1/(1 + i)^n
The discount factor on a sum to be received a year from now (6% p.a.):
1/(1.06)^1 = 0.943
Two years from now:
1/(1.06)^2 = 0.890
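
The worked examples above can be checked with a few lines of Python. This is just a sketch of the three formulas; the function names are my own.

```python
def future_value(present_value_amount, rate, years):
    """FV = PV x (1 + i)^n"""
    return present_value_amount * (1 + rate) ** years


def discount_factor(rate, years):
    """Discount factor = 1 / (1 + i)^n"""
    return 1 / (1 + rate) ** years


def present_value(future_value_amount, rate, years):
    """PV = FV x discount factor"""
    return future_value_amount * discount_factor(rate, years)


print(round(future_value(500, 0.06, 2), 2))      # value of €500 after 2 years at 6%
print(round(present_value(561.80, 0.06, 2), 2))  # value today of €561.80 in 2 years
print(round(discount_factor(0.06, 1), 3))        # one year from now
print(round(discount_factor(0.06, 2), 3))        # two years from now
```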


ERD: an Entity Relationship Diagram, which focuses on system data
DFD: a Data Flow Diagram, which focuses on system processes
System Modelling
A model is a representation of reality
Logical models
Show what a system “is” or “does”. They are implementation-independent: they illustrate the essence of the system
Physical models
Show not only what a system “is” or “does” but also how the system is physically and technically implemented. They are implementation-dependent
Systems analysis activities focus on the logical system models:
Logical models remove biases
Logical models reduce the risk of missing business requirements
Logical models allow us to communicate with end users in non-technical language
Data Modelling
Sometimes called database modelling because a data model is usually implemented as a database
The process of constructing data models helps analysts and users quickly reach consensus on business terminology and rules.
Data models are frequently referred to as entity relationship diagrams (ERDs)
Entities
All systems contain data, and data describes “things”
A concept to abstractly represent all instances of a group of similar “things” is called an entity.
An entity is something about which we want to store data.
E.g. persons, places, objects and events about which we need to capture and store data.





An entity instance is a single occurrence of an entity
Attributes
The pieces of data that we want to store about each instance of a given entity are called attributes
Values for each attribute are defined in terms of three properties: data type, domain, and default
The data type of an attribute:
E.g.
Number     10-99
Text       max size 30
Date       mmddyy
The domain
The domain of an attribute defines what values an attribute can legitimately take on.
E.g.
Number     integers, 10-99
Text       max size 30
Date       format mmddyy
The default value for an attribute is the value that will be recorded if none is specified by the user
E.g. number = 0
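
As a rough illustration of the three properties, here is a Python sketch in which every attribute carries a data type, a domain check and a default. The Attribute class, the attribute names and the default values are all hypothetical; the 10-99 range and the maximum text size of 30 come from the examples above.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Attribute:
    name: str
    data_type: type                    # what kind of value is stored
    in_domain: Callable[[Any], bool]   # which values are legitimate
    default: Any = None                # recorded when the user gives no value

    def accept(self, value=None):
        """Return the value to store, or raise if it breaks the type or domain."""
        if value is None:
            return self.default
        if not isinstance(value, self.data_type):
            raise TypeError(f"{self.name}: expected {self.data_type.__name__}")
        if not self.in_domain(value):
            raise ValueError(f"{self.name}: {value!r} is outside the domain")
        return value


# Hypothetical attributes based on the examples in the notes.
code = Attribute("code", int, lambda v: 10 <= v <= 99, default=10)
title = Attribute("title", str, lambda v: len(v) <= 30, default="")

print(code.accept(42))   # valid value is stored as given
print(code.accept())     # no value given, so the default is recorded
```

A DBMS enforces the same three ideas through column types, constraints and DEFAULT clauses.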
Key
Every entity must have an identifier or key that uniquely identifies each entity instance
Candidate key: where an entity has more than one possible key, each of these is called a candidate key
Concatenated key: where a group of attributes is required to uniquely identify an instance of an entity
E.g. DVD entity in video store: Title No + Copy No
Primary Key
Is the candidate key that will most commonly be used to uniquely identify a single entity instance
E.g. Type ID
Alternate keys
Are those candidate keys not chosen as the primary key



Customer
Customer number (PK)
Customer name
Shipping address*
Billing address*
Balance due*

*alternate keys
If I took customer name and customer number together, it would be a concatenated key

Relationship: Customer has placed Order
A primary key imported into Order becomes a foreign key

Order
Order number (PK)
Order date
Order total cost
Customer number (FK)

Ordered Product
Ordered product ID (PK)
Order number (FK)
Product number (FK)
Quantity ordered
Unit price

Inventory Product
Product number (PK)
Product name
Unit of measure
Unit price

Relationship: Inventory Product is sold as Ordered Product

Subsetting criteria
A subsetting criterion is an attribute whose values divide all entity instances into useful subsets.
E.g. gender, when we need to identify all male students and all female students
Relationships
A relationship is a natural business association that exists between one or more entities
A connecting line between two entities on an entity relationship diagram (ERD) represents a relationship
A verb phrase describes the relationship
All relationships are implicitly bidirectional: they can be interpreted in both directions (binary)

E.g. Curriculum and Student:
A Curriculum is being studied by 0 or more Students
A Student is enrolled in 1 or many Curricula
Cardinality
The complexity or degree of each relationship is called cardinality
Cardinality defines the minimum and maximum number of occurrences of one entity for a single occurrence of the related entity
E.g. must there exist an instance of student for each instance of curriculum? No
Must there exist an instance of curriculum for each instance of student? Yes
3 important things in an ERD
1.       Attributes
2.       Keys (primary, alternate, foreign)
3.       Cardinality: show the symbol and the relationship, e.g. 1 to many
The degree of a relationship is the number of entities that participate in the relationship
A binary relationship has a degree = 2
A recursive relationship has a degree = 1
An associative entity is an entity that inherits its primary key from more than one other entity (parents)
Foreign Keys
A foreign key is a primary key of one entity that is duplicated in another entity for the purpose of identifying instances of a relationship
A non-specific relationship (or many-to-many relationship) is one in which many instances of one entity are associated with many instances of another entity
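
To show how foreign keys and an associative entity resolve a many-to-many relationship in practice, here is a sketch using Python's built-in sqlite3 module. The table and column names are loosely based on the Order and Inventory Product entities in these notes, the sample rows are made up, and the composite primary key on ordered_product is my own simplification of the associative entity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when asked
conn.executescript("""
CREATE TABLE customer_order (
    order_number INTEGER PRIMARY KEY,
    order_date   TEXT NOT NULL
);
CREATE TABLE inventory_product (
    product_number INTEGER PRIMARY KEY,
    product_name   TEXT NOT NULL
);
-- Associative entity: its primary key is inherited from both parents.
CREATE TABLE ordered_product (
    order_number     INTEGER NOT NULL REFERENCES customer_order(order_number),
    product_number   INTEGER NOT NULL REFERENCES inventory_product(product_number),
    quantity_ordered INTEGER NOT NULL,
    PRIMARY KEY (order_number, product_number)
);
""")

# Hypothetical sample data.
conn.execute("INSERT INTO customer_order VALUES (1, '2015-02-24')")
conn.executemany("INSERT INTO inventory_product VALUES (?, ?)",
                 [(10, "DVD"), (11, "Case")])
conn.executemany("INSERT INTO ordered_product VALUES (?, ?, ?)",
                 [(1, 10, 2), (1, 11, 1)])

# A foreign key value must match an existing parent row.
try:
    conn.execute("INSERT INTO ordered_product VALUES (1, 99, 1)")  # no product 99
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

rows = conn.execute("""
    SELECT p.product_name, op.quantity_ordered
    FROM ordered_product AS op
    JOIN inventory_product AS p ON p.product_number = op.product_number
    WHERE op.order_number = 1
    ORDER BY p.product_number
""").fetchall()
print(rejected, rows)
```

One order can hold many products and one product can appear on many orders, but each row of ordered_product links exactly one order to one product.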

Facts for data modelling can be collected through JAD (joint application development) sessions, by sampling existing forms and files, by researching similar systems, and by surveys of users and management
How to construct data models
Step 1 – Identify entities
Entities should be named with nouns that describe the person, event, place, or tangible thing about which we want to store data.
Step 2 – Define keys for each entity
The value of a key should not change over the lifetime of each entity instance.
The value of a key cannot be null (entity integrity)
Step 3 – Draw a rough draft of the ERD model
Step 4 – Identify data elements/attributes
Step 5 – Draw the fully attributed data model
Process Concepts & Conventions
Logical processes are work or actions that must be performed no matter how you implement the system
Each logical process will be implemented as one or more physical processes that may include:
·         Work performed by people
·         Work performed by robots / machines
·         Work performed by computer software
Three types of logical processes: functions, events, and elementary processes.
Functions
A function is a set of related and on-going activities of the business. A function has no start or end; it just continuously performs its work as needed.
Events
An event is a logical unit of work that must be completed as a whole
Elementary processes
An event process can be further decomposed into elementary processes that illustrate in detail how the system must respond to an event
Three Errors with process
A black hole is when a process has inputs but no outputs. Data enters the process and then disappears
A miracle is when a process has output but no input
A grey hole is when the inputs of a process are insufficient to produce the output (the most common error)
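
Black holes and miracles can be spotted mechanically once each process is listed with its input and output data flows. The Python sketch below does that for a made-up set of processes (the process and data flow names are hypothetical). Grey holes are not checked here, because deciding whether the inputs are sufficient needs human judgement.

```python
# Hypothetical process list: name -> (input data flows, output data flows)
PROCESSES = {
    "Check credit": (["order"], ["approved order"]),
    "Archive order": (["approved order"], []),   # inputs but no outputs
    "Generate invoice": ([], ["invoice"]),       # outputs but no inputs
}


def find_process_errors(processes):
    """Return the processes that are black holes or miracles."""
    errors = {}
    for name, (inputs, outputs) in processes.items():
        if inputs and not outputs:
            errors[name] = "black hole"
        elif outputs and not inputs:
            errors[name] = "miracle"
    return errors


errors = find_process_errors(PROCESSES)
print(errors)
```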

Process modelling  
A technique for organizing and documenting the structure and flow of data through a system's processes
A data flow diagram (DFD), also called a bubble graph, is a process model that depicts the flow of data through a system and the work or processing performed by the system.
As information moves through software, it is modified by a series of transformations
Data Flow Diagram
Uses three symbols and one connection

What are the symbols of a DFD (data flow diagram)?


Process
Rounded rectangles or circles represent processes. A process performs a transformation on its input data to yield its output data

External agents or external entities
Squares represent external agents: a source of system inputs or a sink of system outputs residing outside the system

1.       Data flows should not split into two or more different data flows
2.       Do not connect two external entities directly to each other. NB: do not connect an external entity directly to a data store either
3.       A process needs to have at least one input data flow and one output data flow
Data stores
Open-ended boxes represent data stores, which are repositories of data. Data stores are never shown in a context diagram
Data Flows
Arrows represent data flows, which connect processes to each other
Rules for DFDs
1.       Each object must connect to at least one other process. Objects are external entities, processes, data stores and data flows
2.       Each process must use at least one input data flow and produce at least one output data flow
3.       Each item in a process’s output data flow must correspond to or derive from the process’s input data flow
4.       Each data flow to or from a data store or external entity must connect to a process
Maximum of 7 processes per DFD
Arrange processes so that the major data flow is from left to right
Name processes with verbs
Name data flows, external entities and data stores with nouns
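Several of these rules can be checked automatically if the diagram is written down as a list of flows. This is a rough Python sketch under stated assumptions: the `Flow` tuple, the `check_dfd` function and the example names are all hypothetical and do not come from the course material. It checks that every flow touches a process, that every object is connected, that every process has an input and an output, and the limit of 7 processes. It does not check rule 3, which needs the individual data items.

```python
from collections import namedtuple

# Hypothetical representation: a flow carries data from one named object to another.
Flow = namedtuple("Flow", ["source", "target", "data"])

MAX_PROCESSES = 7  # limit quoted in the rules above


def check_dfd(processes, entities, stores, flows):
    """Return a list of rule violations (an empty list means none were found)."""
    processes = set(processes)
    problems = []

    connected = set()
    for f in flows:
        connected.update((f.source, f.target))
        # Rule 4: every flow must have a process on at least one end.
        if f.source not in processes and f.target not in processes:
            problems.append(f"flow '{f.data}' does not touch a process")

    # Rule 1 (simplified): every object takes part in at least one flow.
    for name in sorted(processes | set(entities) | set(stores)):
        if name not in connected:
            problems.append(f"'{name}' is not connected to anything")

    # Rule 2: each process needs at least one input and one output.
    for p in sorted(processes):
        if not any(f.target == p for f in flows):
            problems.append(f"process '{p}' has no input data flow")
        if not any(f.source == p for f in flows):
            problems.append(f"process '{p}' has no output data flow")

    if len(processes) > MAX_PROCESSES:
        problems.append("more than 7 processes in one DFD")
    return problems


flows = [
    Flow("Customer", "Take order", "order"),
    Flow("Take order", "Orders", "order record"),
    Flow("Customer", "Orders", "direct update"),  # entity straight to data store
]
for problem in check_dfd(["Take order"], ["Customer"], ["Orders"], flows):
    print(problem)
```

In this example only the third flow is reported, because it links an external entity to a data store without passing through a process.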
A process is work performed on, or in response to, incoming data flows or conditions


Process Decomposition
Used when a complex system is too difficult to fully understand when viewed as a whole (meaning as a single process)
Systems analysis separates a system into its component subsystems, which in turn are decomposed into smaller subsystems
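Decomposition can be pictured as a tree: one process for the whole system at the top, and smaller processes beneath it. The sketch below is only an illustration. The `Process` class, the `outline` function, the example process names and the numbering scheme (0 for the whole system, then 1, 1.1, 1.2 and so on) are assumptions and are not taken from the course material.

```python
class Process:
    """A process that may be decomposed into smaller child processes."""

    def __init__(self, name, children=None):
        self.name = name
        self.children = children or []


def outline(process, number="0", depth=0):
    """Return indented, numbered lines for a process and everything below it."""
    lines = ["  " * depth + f"{number} {process.name}"]
    for i, child in enumerate(process.children, start=1):
        # Children of the top-level process are 1, 2, 3; deeper levels add a dot.
        child_number = str(i) if number == "0" else f"{number}.{i}"
        lines.extend(outline(child, child_number, depth + 1))
    return lines


# Hypothetical system used only to show the idea
system = Process("Order system", [
    Process("Take order", [Process("Check stock"), Process("Record order")]),
    Process("Ship order"),
])
print("\n".join(outline(system)))
```

Each level of the printed outline corresponds to one more round of decomposition, so the deepest lines are the smallest subsystems.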
Process abstraction
The process of reducing complexity by mapping details into one higher-level concept, i.e. a context diagram
Never add data stores to a context diagram
Only one process, data flows and external entities are placed in a context diagram
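The two context diagram rules are simple enough to express as a check. This is a small Python sketch, and the function name `check_context_diagram` and its parameters are hypothetical. It assumes that "only one process" means exactly one process.

```python
def check_context_diagram(processes, stores):
    """Return a list of problems found in a context diagram (empty if none)."""
    problems = []
    # Assumption: a context diagram shows the whole system as exactly one process.
    if len(processes) != 1:
        problems.append("a context diagram should contain exactly one process")
    # Data stores are internal detail, so they are never shown at this level.
    if stores:
        problems.append("data stores must not appear in a context diagram")
    return problems


print(check_context_diagram(["Order system"], []))
print(check_context_diagram(["Order system"], ["Orders"]))
```

The first call returns an empty list, while the second reports the data store.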
Requirements-Gathering Techniques
Interviews
Most commonly used technique
Basic steps
·         Selecting interviewees
·         Designing interview questions
·         Preparing for the interview
·         Conducting the interview
·         Post-interview follow-up
Selecting interviewees
·         Based on information needs
·         Best to get different perspectives
Managers
Users
Ideally, all key stakeholders
·         Keep organizational politics in mind
Designing interview questions
·         Unstructured interview useful early in information gathering
Goal is broad, roughly defined information
·         Structured interview useful later in process
Goal is very specific information
Preparing for interview
·         Prepare general interview plan
List of questions
Anticipated answers and follow-ups
·         Confirm areas of knowledge
·         Set priorities in case of time shortage
·         Prepare the interviewee
Schedule
Inform of reason for interview
Inform of areas of discussion
Conducting the interview
·         Appear professional and unbiased
·         Record all information
·         Check on organizational policy regarding tape recording
·         Be sure you understand all issues and terms
·         Separate facts from opinions
·         Give interviewee time to ask questions
·         Be sure to thank the interviewee
·         End on time
Post-interview follow-up
·         Prepare interview notes
·         Prepare interview report
·         Have interviewee review and confirm interview report
·         Look for gaps and new questions
Joint Application Development (JAD)
·         A structured group process focused on determining requirements
·         Involves project team, users, and management working together
·         May reduce scope creep by 50%
·         Very useful technique
·         Quality: walkthroughs, inspections, and formal technical reviews
JAD Participants
·         Facilitator
Trained in JAD techniques
Sets agenda and guides group processes
·         Scribe(s)
Record content of JAD sessions
·         Users and managers from business area with broad and detailed knowledge
Preparing for JAD sessions
·         Time commitment – ½ day to several weeks
·         Strong management support is needed to release key participants from their usual responsibilities
·         Careful planning is essential
·         E-JAD can help alleviate some problems inherent with groups





Conducting the JAD Session
·         Formal agenda and ground rules
·         Top-down structure most successful
·         Facilitator activities
Keep session on track
Help with technical terms and jargon
Record group input
Stay neutral, but help resolve issues
·         Post-session follow-up report
Post-JAD follow-up
·         Post-session report is prepared and circulated among session attendees
·         The report should be completed approximately one to two weeks after the JAD session
Questionnaires
·         A set of written questions, often sent to a large number of people
·         May be paper-based or electronic
·         Select participants using samples of the population
·         Design the questions for clarity and ease of analysis
·         Administer the questionnaire and take steps to get a good response rate
·         Questionnaire follow-up report
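Selecting participants by sampling can be done with a few lines of code. This is a minimal Python sketch that assumes a simple random sample, which is only one possible approach, as the notes do not say which sampling method to use. The function name `pick_participants` and the staff list are hypothetical.

```python
import random


def pick_participants(population, sample_size, seed=None):
    """Draw a simple random sample, without repeats, from the population."""
    rng = random.Random(seed)  # a fixed seed makes the draw repeatable
    return rng.sample(list(population), sample_size)


# Hypothetical population of 200 staff members
staff = [f"employee-{n:03d}" for n in range(1, 201)]
chosen = pick_participants(staff, 20, seed=1)
print(len(chosen), len(set(chosen)))  # 20 20: twenty people, no duplicates
```

Recording the seed along with the questionnaire results lets the same list of participants be reproduced later.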
Good questionnaire design
·         Begin with nonthreatening and interesting questions
·         Group items into logically coherent sections
·         Do not put important items at the very end of the questionnaire
·         Do not crowd a page with too many items
·         Avoid abbreviations
·         Avoid biased or suggestive items or terms
·         Pre-test the questionnaire to identify confusing questions
·         Provide anonymity to respondents
Document analysis

·         Study of existing material describing the current system
·         Forms, reports, policy manuals, organization charts describe the formal system
·         Look for the informal system in user additions to forms/reports and unused form/report elements
·         User changes to existing forms/reports or non-use of existing forms/reports suggest the system needs modification



Observation
·         Watch processes being performed
·         Users/managers often don’t accurately recall everything they do
·         Checks validity of information gathered in other ways
·         Be aware that behaviours change when people are watched
·         Be unobtrusive
·         Identify peak and lull periods
Selecting the appropriate requirements-gathering techniques
·         Type of information
·         Depth of information
·         Breadth of information
·         Integration of information
·         User involvement
·         Cost
·         Combining techniques
Reference: information obtained from course material

 