Here you will find the explanation of the Summary attribute type, which can be specified together with the associated tab of the Redefine Attribute or Insert New Attribute dialog. Please refer to the general text for topics applying to all attribute types, i.e. to all tabs, for example:
With a Summary you can define a so-called derived attribute whose value is calculated or derived based upon the values of one existing attribute. As a rule, the values of an attribute are summarized for all displayed objects. You can select from a series of predefined summaries, for example Sum, Average or Maximum. We define this type of derivation as horizontal calculation, in contrast to vertical calculation for formulas. Summaries already cover a large portion of simple analysis tasks.
InfoZoom provides two different methods to create summaries.
- Summary of one attribute
- Summaries of two attributes
Of this, there are to sub-types:
- The per attribute is an attribute with values
- The per attribute is a heading
When summarizing one attribute, the values of the referenced attribute of all the displayed objects are used for the calculation and the result is generally one single value. (In the case of the summary list, this single value can also be a list of values, but it is the same list for all objects.) This single value is assigned to the derived attribute for all objects of the table. (Exceptions: The summaries Previous, Next, Number, Repetitions and Accumulation normally result in different values.)
Note: A unique result (for example, the average) appears in the table in the Compressed Table view or Overview due to the InfoZoom mechanisms (identical, neighboring values are summarized) only in a cell that extends across the entire width of the table. Nonetheless, these are many individual values. This is important and may lead to a surprising result, for example if you apply the summary Sum to an average.
For the Summary of two attributes, the calculation for each different value of the per attribute is performed separately. This can lead to the creation of a different result for each different value of the per attribute. For all objects of the per attribute with the same value, the result is assigned to the derived attribute. The note above applies accordingly.
With the checked Use unique value checkbox, you specify that a sum or an average over the from attribute's unique values is calculated for the pro attribute's various objects.
A summary of several attributes can be created if you define a heading as per attribute. In this way you get a particularly flexible, possibly multidimensional analysis group.
The Following Predefined Summaries Are Available
- Count of A: Calculates the number of different values of attribute A.
- List of A: Creates the list of the different values of attribute A. The calculation mode Automatic can be defined here, however, a recalculation is not performed upon resorting the table, but only if the lists content is supposed to be changed.
- Sum of A: Calculates the sum of all values of attribute A.
- Accumulation of A: Calculates the sum of the values of A in regard to the current sorting (Note). The new attribute thus contains the following values: The first value of A, the sum of the first two values of A, the sum of the first three values of A, ... and finally as the end value: the sum of all values of A.
- % accumulation of A: Calculates the percentage of the accumulation of the values of A in relation to the total sum of all values in regard to the current sorting (Note). The new attribute contains the following values: The percentage of the first value of A in relation to its total sum, the percentage of the sum of the first two values, etc. ... and finally as the last value: 100%.
- Minimum of A: Calculates the smallest value of attribute A.
- Maximum of A: Calculates the largest value of A.
- Average of A: Calculates the average of the values of attribute A.
- Median of A: Calculates the median of the values of attribute A.
- Mode of A: Calculates the most frequent value of A. If this value is not unique, the result is undefined.
- Variance of A: Calculates the variance of the values of A. (In Excel: VARP)
- Standard deviation of A: Square root of the variance. (In Excel: STDEVP)
All previous summaries can be selected, similar in relation to a second attribute (or an attribute group), such as
Count of A per B; for example: Count of vehicle models per manufacturer.
- Previous value of A: Calculates the previous value of the selected attribute. Here, it is not the numerical previous value, but rather the previous value of according to the value list that is calculated. That means that if an attribute contains the even numbers (2, 4, 6,...) then the previous value of 4 is 2 and not 3.
- Previous value of A per B: Calculates the previous value of A according to how B is sorted. For example, the previous value (venue) by date in a Formula1 table of the venue of the previous race can be calculated.
If the per attribute B for a previous value of A contains no unique value, the result has the value undefined.
- Next value of A and next value of A per B: As in Previous.
- Number of A: Numbers the values of A according to the current sorting (Note).
Example: Sorting the watches table according to price (descending order). Inserting the summary Number(Price). The watches with the ten highest prices are found by zooming in on the values 1 to 10 in Number(Price).
- Number of ID: Numbers all displayed objects according to the current sorting (Note).
- Number of A per B: Numbers the values of A according to the value of B as determined by the current sorting.
- Ranking of A: Consecutively numbers the values of A (independent of the current sorting). The largest value receives the number 1, the second largest the number 2, etc. An empty value and values that do not conform to the defined format receive the result undefined.
- Ranking of A per B: Consecutively numbers the values of A (independent of the current sorting) per B. If different objects of A receive identical rankings for identical values of B, the next rank is increased accordingly. For example, if there are two third rankings, the next value receives the rank five, and not four. Example in Formula1: Sort all drivers according to total points Sum(Points) per Statistics. Define the attribute with ranking Ranking(Sum(Points) per Statistics) per Driver: The driver with the most points receives 1, the second best, 2, etc. Two drivers are at rank 11; the next driver receives rank 13.
Additionally for Summarizing an Attribute
- Repetitions of A: Calculates how many neighboring objects for the attribute A have identical values for the current sorting (Note).
- Percentile of A: For every value of A, the percentage of objects is calculated that have this or a smaller value. For the smallest value, this produces this value's percentage, for the second smallest value the sum of the two smallest percentage, etc. and for the largest value the result is always 100%. The empty value is always the largest. Example for Autos2000.fox: Percentage(Total consumption) calculates for each how many of the displayed automobiles have this or a smaller value.
Additionally for Summarizing Two Attributes
- % Percentage of A per B: Calculates the number of different values of A per value of B in relation to the total number of different values of A.
- % Percentage of ID per B: Calculates the number of objects per value of B in relation to the total number of objects.
- % displayed of A per B: Calculates the number of currently displayed different values of A per value of B in relation to the total number of different values of A per value of B.
- % displayed of ID per B: Calculates the number of currently displayed objects per value of B in relation to the total number of objects per value of B.
- % sum of A per B: For each value of B, determines the percentage of the sum of the values of A relative to the total sum of A. This is an abbreviation for the formula 100 * [sum(A) pro B] / [sum(A)] and the two included summaries sum(A) per B and sum(A). Adding the resulting value for this summary (once) for every value of B returns the sum 100%.
- % average of A per B: For each value of B, determines the percentage of the average of A in relation to the average of all values of A.This is an abbreviation for the formula 100 * [average(A) per B] / [average(A)] and the two included summaries average(A) per B and average(A).
See example for Summaries with Percent.
Note regarding summaries that depend upon the current sorting: The current sorting in the Overview view is derived from the object sorting on the basis of the attributes of a coupling if the summary is part of a coupling. Otherwise, the current table sorting is not visible here and it would be better to define the summary in the Compressed Table view. Here, the common sorting of all table objects is considered. The calculation mode of such summaries can only be set to After data loading, Manual (F9) or Once (on definition). The calculation mode Automatic cannot be chosen since this can easily lead to contradictions if, for example, you sort according to the derived attribute itself.
Special case: The empty value sometimes has a special status: For Count, List, Mode and Percentage, during entry it is not considered, and for Sum, %Sum, Accumulation and %Accumulation it counts as zero. For Number, Repetitions, % Percentage, and % Displayed the empty value is treated like every other value and for Ranking it leads to the result undefined. For all other summaries it can be specified using the Interpret missing values as 0 option how it should be treated. Illegal values are treated as empty values. Illegal values are those values that cannot be interpreted in the specified Format of the attribute.
Warning: For summaries with two attributes, it does not make sense (except for Previous, Next and Accumulation) to select an attribute as per attribute that has different values for all table objects. (For example, the summary of Sum, Average, Maximum and Minimum produces a result identical with the starting attribute.) Since the first attribute of each table, the ID, always has different values for all objects, an error message results: "The ID is not permitted as second attribute". This error message does not appear for other attributes since this characteristic may change later (through editing or redefinition of the per attribute).
- Recalculation. Summaries are calculated depending on their calculation mode, e.g. either automatically (dynamically) or manually. Use the Recalculation option to determine under which circumstances recalculation shall be carried out.
- With the option Interpret missing values as 0 you can specify how empty and illegal values should be treated.
- For the summaries Previous or Next you can specify a particular value for the first or last value under the option Boundary for Previous or Next. By default, when the associated edit field is empty, the value undefined is used.
- Reverse direction. For the summaries Ranking, Percentage, Accumulation and %Accumulation, the processing direction through the objects can be reversed with this switch. The summary Ranking, for example, assigns the rank 1 to the lowest value.
Define or Redefine Attribute (General Part)
Simple Attribute and Duplicates
Linking with a Different Attribute
Derived Attribute Using Aggregation
Derived Attribute Using Formula
Derived Attribute Using Classification
Derived Attribute Using Case Differentiation
Insert Attribute Group