Database Design: From Novice to Professional [Ch.3: Initial Requirements and Use Cases]

In this chapter we will foucs on the Analysis part but in the abstract world focus on what the user can do in our system.

Remember the 4 process thinking, Now we consider part of the first step from real-world problem to eventual real-world solution as described in last chapter

There are two things we need to do:

Understand what tasks all the people who will use the system need to carry out.
After that figure out what data we will need to store to support them.

there are two views of the problem. One is the concrete, real-world view from the person who will be the eventual user (I will call this person the client) and the other view is the more abstract model from the person who is designing and possibly developing the system (I’ll call this person the analyst). If you are designing your own database, then wear two hats and swap them as necessary.

Real and Abstract Views of a Problem

The analyst sees the problem in a mostly abstract way. For the type of data-based problems we are considering, the processing can mostly be separated into

• Entering, editing, or otherwise maintaining data

• Extracting information from the database based on some criteria

The first thing an analyst must do is to understand the client’s problem in sufficient detail to help determine the input and output requirements (both immediate and potential). These can be expressed in use cases. The analyst then needs to develop a data model that will support those requirements.

Data Minding

This type of project involves a client who has data that needs to be looked after. This is often the case for research results

but there is often little clear idea of how to store the data most effectively. The analyst’s responsibility here is to think ahead and ask questions about how the data may be used in future situations

Predicting the potential output requirements, given the type of data that is being collected, is one of the most difficult aspects of problems involving storing data.

A careful analysis at this stage helps prevent the very common and infuriating situation of knowing the data is “in there” but not being able to “get it out” conveniently.

Task Automation

This type of project involves a client with a job that needs to be automated. This could be a small business, club, or school that has been keeping records by hand or with software that needs to be updated.

Maybe they are looking to transfer their data to a database with a web interface. These clients usually have a clear idea of what they do. The analyst’s job here is to separate what the client does from what needs to be recorded and reported and recast the problem

for a task automation problem every problem is different, so we need a general framework for discovering and representing the intricacies of a database problem. A good start is to determine answers to the following questions:

• What does the user do?

• What data is involved?

• What is the main objective of the system?

• What data is needed to satisfy this objective?

• What are the input use cases?

• What is the first data model?

• What are the output use cases?

The preceding steps are iterative. As we find out more about the problem, we will probably have to return to the early steps and adjust them. We will work through these steps in the context of next example(meal deliveries)

Visitors to the city staying in local motel or hotel rooms are offered a service that will deliver them a variety of fast food or takeaway meals (pizzas, burgers, Chinese takeout, and so on). The visitor phones the company and places an order for some meals. A driver is selected and dispatched to pick up the meals from the appropriate fast-food outlets. The driver delivers the meals to the customer, receives the payment, and informs the depot. He also fills in a time sheet, which he returns to the depot later. One of the reasons given for wanting to automate this, currently manual, process is to be able to produce statistics about the numbers of orders taken and about the time taken to complete orders.

now we will drive though each question

1) what does the user do

As a start, it is useful to list the jobs that the user regularly undertakes. Here is a start for the meal deliveries example:

• Receptionist records details of order (address, phone number, meals, total price).

• Receptionist selects a driver and gives him the information about the order.

• Driver picks up meal(s) from fast food outlet(s).

• Driver delivers meal(s) and informs the depot.

• Driver hands in time sheets at end of shift.

• Receptionist or manager produces weekly and monthly statistics.

The first five of the preceding tasks may involve entering data into the system while the last task is reporting on information already in the system.

2) What Data Is Involved?

The tasks described in the previous section are very much stated from the users’ point of view and are what physically take place. We need to step back a bit, put on our analyst’s hat, and think about what data, if any, needs to be recorded or retrieved at each step.

We will show steps in order if someone make a meal order

Take order: We need to be able to identify an order easily. We could refer to the customer and the time of placement, but generally it is easier to assign an order number to make it easier to track the order through its various stages.What about meals? How do we record this information?

maybe customer choosing from a list of meals so that a selection can be made? What about price? If we have data about the meals, we may already know the price.

Dispatch driver: First up, we need to think about how we know which driver is going to deliver the order. Does the system need to keep track of the whereabouts of drivers and determine which driver is the most appropriate? Does the receptionist choose from a list of drivers on duty? Does the system need to keep track of which drivers are available or which are currently on a delivery? If all the drivers are busy, what happens?

Does the system need to record which outlets provided the meal for this order? If the outlets for pizzas and curries are far apart, might two drivers be involved?

Pick up meals: What do we want to record about a driver picking up a meal? Do we want the system to be able to tell us what stage an order is at (e.g., “Curries were picked up at 8:40, pizzas have not been collected yet”)? Do the eventual statistics need to be separated into times for picking up meals and times to deliver meals, or will overall times do?

Deliver meals: If statistics on time are important, recording the time the meals were delivered will be essential. Enter time sheets: Assuming that time sheets are currently managed manually, looking at an existing time sheet will be very helpful. It is possible that the manual time sheet will contain some of the information we have already discussed. Is there any data that we have not recorded yet? Does the system need to record information about pay rates and payments to the drivers?

3) What Is the Objective of the System?

Clearly, a system to record meal deliveries could be quite small or very large depending on how much of the information in the previous section we decide to record. With our analyst’s hat on, we need to sort out the main objectives and provide pragmatic solutions.

One common problem if you are working with other people is that as you ask questions similar to the ones described earlier, your clients may become quite enthusiastic about broadening the scope of the system to include more and more. They will soon settle down when they realize that extras come at a cost.

It is important not to see everything that could be automated as something that should be automated.

Many tasks are much more conveniently done manually Human judgment is also better than a computer’s in many cases. A good example is assigning demonstrators to laboratory classes. While the database may have all the information about requirements and availability, the actual matching up may be better done by a real person who has additional information

It is best to keep the scope of the problem as small and tightly defined as possible in the early stages of the analysis. Satisfy the most pressing requirements first.

Let’s think about the meal delivery example. The initial incentive for developing the database was to provide summary information about the orders and the times involved. Information about orders in a summary might include the total number of orders and/or their combined value, probably within some time frame (weekly or monthly). This information might allow the company to identify some trends and adapt its business accordingly.

Let’s think about the time statistics. How detailed should they be? Here is where you need to be imaginative.

A question such as “What statistics do you want about time?” may not elicit adequate detail from a client. If it doesn’t, you might try to think what could be achieved and try some more specific questions

Here are a few suggestions:

• Do you need to have statistics to back up statements such as “Our meals are delivered within 40 minutes” or “Our average delivery time is 15 minutes”?

• Do you need to be able to break down the delivery time to see where the delays are? For example: How long does an order typically have to wait before a driver becomes available? What proportion of the time is spent waiting for the meals to be prepared? What is the average time taken to deliver a meal from outlet to customer?

• Do you need to be able to break these statistics down by driver? For example, to find out if any drivers are regularly slower than others?

The purpose of these questions is to determine the most pressing requirements. Let’s assume that for this small business the main objective is just to get some idea of the overall times from phone call to delivery. Asking the other questions may (or may not) lead the client to become too ambitious: “I never thought of that. What a good idea. Throw that in as well.”

it is essential to consider how realistic it is to obtain data sufficiently reliable to fulfill these extra ideas. The main objective of overall delivery times isn’t too difficult. It requires the time of the call to be logged and the time of final delivery. Any more detail than that comes at significant cost

If, however, the extra information is one of the main purposes of acquiring the system, there are still issues to consider. How accurate will the data be? If drivers suspect that times are being recorded next to their names, might they feel pressured into being less than accurate sometimes? Setting up a complicated system to analyze inaccurate numbers is a waste of everybody’s time and money.

Let’s assume that after some careful thought it is agreed that only the total delivery time is required. We can now restate the main objectives of the project: To record orders for meals so that summaries of the number, value, and overall time taken to process orders can be retrieved for different time periods.

4) What Data Is Required to Satisfy the Objective?

now with the more clearly stated objective in mind. After further consultation with the client, we can produce some more precise descriptions of the tasks

Take order: If we are to provide statistics by month or week, we will need to record a date. The client has confirmed that there is a price list of different meals, and it would be useful for the receptionist to be able to select off this list. We will therefore need an additional task to enter and maintain information about meals and their prices. The client confirms that the cost of the order is just the total cost of all the meals

Dispatch driver: We need to know how a driver is chosen and determine what we need to record. Let’s assume we discover that the drivers are assigned to be on duty for various time units. Obviously, being able to maintain and print out duty rosters would be a useful thing to be able to do.

but that doesnt contribute to our main objective. It is agreed to leave the rosters outside the scope of the system for now. The receptionist will use information available independently of the database (probably a list of names pinned to a notice board) to determine who should be assigned to deliver an order.

Should the system provide a list of drivers for the receptionist to choose from? Why do we need the driver’s name? Well, clearly we need to be able to contact the driver for a specific order to check up on progress or make alterations. Maybe all we need is a name and a cell phone number. This is a good point to check with the client.

Pick up meals: We decided that the statistics are not going to differentiate times for picking up and delivering a meal, so we don’t need to record the times at every stage of the process.

Deliver meals: If we want to have statistics on overall delivery times, we clearly need to record the time that each meal is delivered. We don’t need to be concerned at this stage how that information gets into the database. The driver may ring the depot or write the time on a time sheet for entering later. At this stage, we are only concerned that the system is capable of storing the delivery time for each order. When the order is delivered, the receptionist also needs to know that the driver is free to take another order. We decided in the section about dispatching drivers that for now these decisions would be independent of the database. The receptionist would probably just make a manual note

Enter time sheets: We already have the driver’s name, information about the order, and delivery times recorded. Is there anything else we need to record at this step?

So we can summarize at the end that:

The system will record and provide information about meals and their current prices. It will maintain data about orders including the date, the meals requested, and contact information for the customer and the driver assigned to the delivery. It will also maintain the time the order was placed and the time it was finally delivered. Given this, the system will be able to provide summary information about the number and value of orders within particular time periods and also summaries of the time taken for total processing of orders. The system will not maintain any additional information about drivers nor about which drivers were associated with a particular order. The system will not maintain any information about outlets nor which were used for any particular order.

What Are the Input Use Cases?

Recall that use cases are simply textual descriptions of the ways users interact with the system. There are many different levels of use case from very high-level descriptions of objectives to very low-level tasks.

The most useful level for our purposes of trying to understand and describe a database system is the user task level.

Now that we have a clearer idea of the objectives and the scope of the system, we can return to our list of jobs that involve data entry and decide what interaction with the system needs to take place at each point.

We could consider combining all the tasks that involve data about an order into one use case

However, for this problem these tasks are all quite separate, performed at different times, and possibly by different people. It may not be possible to assign a driver to an order immediately, so entering the driver contact data should be a separate task from entering the order. Similarly, recording the delivery time is a separate task performed at a different time. Each of these tasks to do with updating an order are central to the whole business and will be repeated several times a day, so it is reasonable to consider providing each with its own use case.

We have enough data stored to be able to find orders with no driver contact number or no delivery time. Given that this information will be almost essential to the receptionist and it is readily available in the system, we will add locating uncompleted orders as a use case also

• Use case 1: Maintain meal data. Enter and update data on meals (ID, description, current price).

• Use case 2: Enter an order. Enter initial order information (order number, date, address, phone) and for each meal (ID). (This assumes prices do not change) Each meal must be one already in the system.

• Use case 3: Update order status. For a particular order already in the system, add driver contact number or delivery time.

• Use case 4: Report on order status. Retrieve all orders satisfying required status (e.g., no driver contact number or no delivery time).

What Is the First Data Model?

Now that we have some idea of the data we need to maintain, we can sketch a first data model for the problem. We clearly have data about at least two separate things, orders and the types of meals that can be supplied

The objects of the Meal class will be each of the types of meal that appear on the menus in a client’s motel or hotel room.

we have separated each of the pieces of data we are recording and put them as attributes in the most likely class.

Reading from left to right, we have that a particular order (e.g., “to Alexandria Street at 12:30 on 10/4/2022”) can involve one or more types of meal. From right to left, we have that each type of meal (e.g., Burger) could appear on many orders but may not appear on any (e.g., no one may ever want to order pizza).

Note that this model is only a first attempt and overlooks some important details.

What Are the Output Use Cases?

We now need to reconsider the required reporting and summarizing tasks in terms of the data we are keeping. We have already determined that it would be useful to report on orders awaiting a driver to be assigned or yet to be delivered and have included that in use case

Let’s think about the statistics on orders and delivery times that are part of our main objective. The statistics on orders can be found by considering the Order objects. We can find the value of each order by summing the prices of each meal associated with that order, given (for now) that prices remain constant. We can also determine the time taken for each order by subtracting the order_time from the delivery_time. By selecting those order objects that are in the date period we are interested in, we can determine different statistics about the times (e.g., averages or totals) during a particular week or month or whatever is required. We have enough information stored in our data model to satisfy the requirements of our main objective.

It is useful at this point to look at the data we are storing and see what other information can be deduced. Given the data we have, what other statistics could we supply?

It might be useful to ask your client whether, given the information is already stored and readily available, they would like to be able to know how much gross income came from pizzas, or how many people ordered curries, or if orders containing particular types of meals took longer to deliver

On closer thought, however, we might guess that information about particular meals (e.g., a Margarita pizza compared with cheese burger) may not be as useful as comparisons between different categories of meals (pizzas versus burgers). If this is the case, we maybe have a new attribute or class, Category.

Each meal could then be assigned a particular category. We will look more closely at whether something like a category should be an attribute or a class later on

Another usecase for manager

Use case: Summary reports on orders. (This assumes constant prices.)

For each completed order with a date in the required time period:

• Find all the associated meals and sum their value of price

• Calculate the time of the order by subtracting order_time from delivery_time.

• If required, group orders by smaller time period (day, week, etc.).

• Average and/or total prices/times.

More About Use Cases

There are no hard-and-fast rules about what use cases should include or how they should be presented. The overriding consideration is that they should be readable and provide a clear and complete description of what each task involves. We now have a closer look at some further aspects of use cases.

Actors

We use an actor as a representation of a user of our database.

In order to take into account all the different interactions our users might make with the database, it is useful to consider all the different types of people our users may encompass.

It is not necessary to become too concerned about which actors are associated with particular use cases. What is important is to consider the different roles of people likely to interact with the system and see the problem from the perspective of each.

Here are some broad categories of roles people might have with examples from our meal delivery service.

• Clerical/data entry operators: Users in this role deal with entering or updating raw data (e.g., entering order details or finding an order to enter a delivery time).

• Supervisors: Users in this role deal with day-to-day details. They may require lists of transactions, rosters, and so on. For our meal delivery database, these users would probably deal with things such as a list of which orders have not yet been delivered or details of specific orders to follow up problems.

• Managers: Managers are more likely to be interested in summaries rather than dayto-day details (e.g., the total number of orders for each day during the last week or the average time of delivery for today’s orders). They may also require very general summaries that show trends and which can be used for forecasting and strategic management decisions (e.g., value of orders per month over the last two years).

Summary

So thinking from the point of view of these different roles (or actors) can give a great deal of information about what the system will need to provide to be most useful. Also writing your use cases will help you a lot when you want to decide the relationships between each entity and will speed up the implementation process.

References

Beginning Database Design: From Novice to Professional