This documentation is in rudimentary form for release 0.1.1, which is meant to gauge how much interest (not the financial kind) this package generates.
The following vignettes are available.
On https://github.com/vanzanden/ggsolvencyii/tree/master/vignettes less rudimentary versions might be available between releases.
It will be very helpful to have seen a few examples of what ggsolvencyii can do before going through this vignette.
A typical spreadsheet might show an ORSA (Own Risk and Solvency Assessment) in the shape represented by the following data.frame:
id | time | ratio | SCR | BSCR | operational | life | market | l_expenses | l_CAT | m_equity | and so on |
---|---|---|---|---|---|---|---|---|---|---|---|
1 | 2017 | 230 | 100 | 80 | 25 | 33 | 50 | .. | .. | .. | .. |
2 | 2018 | 225 | 103 | 85 | 25 | 33 | 57 | .. | .. | .. | .. |
3 | 2019 | 227 | 107 | 90 | 23 | 37 | 60 | .. | .. | .. | .. |
.. |
One can discern several parts. The first columns hold the id of each SCR composition and its ‘meta’ attributes (time, ratio). The remaining columns describe the components of each SCR. The value of each item sits at the intersection of its column and row.
`ggplot2`, the foundation on which the plotting part of this package is built, expects data in a tidyverse format. Each row in the data describes only one data point, i.e. the value of one SCR item for one specific ‘id’.
The following code is used for transferring data (for example 2, a single SCR plot) from a spreadsheet in the same “human format” as above to tidyverse format (the numbers differ, though!):
# read the "human format" sheet: one row per id, one column per SCR item
data <- readxl::read_xlsx(path = "path/filename.xlsx", sheet = "ex2_data")
# move all item columns (everything except id, time and ratio) into rows
data <- tidyr::gather(data,
                      key = description,
                      value = value,
                      -id, -time, -ratio)
sii_z_ex2_data <- data.frame(time = as.numeric(data$time),
                             ratio = as.numeric(data$ratio),
                             description = as.factor(data$description), # it has to be a factor !!
                             value = as.numeric(data$value),
                             id = data$id)
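For readers without the spreadsheet at hand, the same reshaping can be tried on a small in-memory table. The following is a minimal base R sketch, not the package's own code. The names `wide`, `item_cols` and `long` are hypothetical, and the numbers are copied from the first columns of the example table above.

```r
# Hypothetical wide table, copied from the first columns of the example above.
wide <- data.frame(
  id          = 1:3,
  time        = c(2017, 2018, 2019),
  ratio       = c(230, 225, 227),
  SCR         = c(100, 103, 107),
  BSCR        = c(80, 85, 90),
  operational = c(25, 25, 23)
)

# Every column that is not an id or 'meta' attribute is an SCR item.
item_cols <- setdiff(names(wide), c("id", "time", "ratio"))

# Stack one block of rows per item: one row per (id, description) pair.
long <- do.call(rbind, lapply(item_cols, function(col) {
  data.frame(
    id          = wide$id,
    time        = wide$time,
    ratio       = wide$ratio,
    description = col,
    value       = wide[[col]],
    stringsAsFactors = FALSE
  )
}))

# Store the description as a factor, in the original column order.
long$description <- factor(long$description, levels = item_cols)
```

Three ids with three items each give nine rows, each holding exactly one value.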
When the above data is passed to the package with a (very) basic line such as
ggplot() + geom_sii_risksurface(data = sii_z_ex2_data, mapping = aes(x = time, y = ratio, id = id, value = value, description = description))
a lot happens under the hood. Broadly speaking, the following steps are taken for `geom_sii_risksurface` and `geom_sii_riskoutline`:
1. when `geom_sii_riskoutline` is used for comparison of ids, risk values are moved between data rows
2. the structure of the SCR composition is expanded with grouping information
3. the expanded structure is integrated with the data
4. actual grouping is performed by adding rows
5. for all elements to be plotted, the corner coordinates of the circle segments are calculated
6. when applicable, rotation and/or "squarification" is applied by changing the corner coordinates
7. corner coordinates are transformed into a series of points for polygons
`geom_sii_riskoutline` plots (some of) the outlines of the circle segments and as such can be used for an unobtrusive plot, or for an overlay of the composition of one SCR over another (see its use in the vignette ‘showcase’). To avoid having to work with two separate datasets, the optional aesthetic `comparewithid` is available in `geom_sii_riskoutline`. It is best explained with an example. Compare the data of `sii_z_ex1_data` with the expanded structures without and with use of the `comparewithid` aesthetic. The structure of id = 1 is no longer plotted at its own location (2016, 230) but three times in 2017 (at ids 2, 5 and 8): the value 23 for SCR is now present three times in the data. This transformation is applied to all (sub)risks.
## the original data
sii_z_ex1_data[sii_z_ex1_data$description == "SCR", ]
#> time ratio description value id comparewithid
#> 1 2016 230 SCR 23.00000 1 NA
#> 2 2017 233 SCR 23.14993 2 1
#> 3 2018 238 SCR 19.99461 3 2
#> 4 2019 243 SCR 15.61773 4 3
#> 5 2017 231 SCR 19.60600 5 1
#> 6 2018 232 SCR 25.74336 6 5
#> 7 2019 232 SCR 21.91342 7 6
#> 8 2017 227 SCR 25.08169 8 1
#> 9 2018 225 SCR 22.43068 9 8
#> 10 2019 226 SCR 21.91607 10 9
#> without passing the aesthetic 'comparewithid': 10 lines of data
#> description id x y value
#> 35 SCR 1 2016 230 23.00000
#> 34 SCR 2 2017 233 23.14993
#> 33 SCR 3 2018 238 19.99461
#> 31 SCR 4 2019 243 15.61773
#> 39 SCR 5 2017 231 19.60600
#> 38 SCR 6 2018 232 25.74336
#> 32 SCR 7 2019 232 21.91342
#> 36 SCR 8 2017 227 25.08169
#> 37 SCR 9 2018 225 22.43068
#> 40 SCR 10 2019 226 21.91607
#> and with passing the aesthetic 'comparewithid': 9 lines of data
#> description id x y value
#> 28 SCR 2 2017 233 23.00000
#> 31 SCR 3 2018 238 23.14993
#> 32 SCR 4 2019 243 19.99461
#> 29 SCR 5 2017 231 23.00000
#> 33 SCR 6 2018 232 19.60600
#> 34 SCR 7 2019 232 25.74336
#> 35 SCR 8 2017 227 23.00000
#> 30 SCR 9 2018 225 25.08169
#> 36 SCR 10 2019 226 22.43068
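The move of values between rows can be imitated in a few lines of base R. This is only a sketch of the idea, assuming that each row simply takes over the value of the row named in `comparewithid`; it is not the package's internal code. The data.frame `scr` repeats the SCR rows printed above, and `compared` is a hypothetical name.

```r
# The SCR rows of sii_z_ex1_data, retyped from the output above.
scr <- data.frame(
  id    = 1:10,
  time  = c(2016, 2017, 2018, 2019, 2017, 2018, 2019, 2017, 2018, 2019),
  ratio = c(230, 233, 238, 243, 231, 232, 232, 227, 225, 226),
  value = c(23, 23.14993, 19.99461, 15.61773, 19.606,
            25.74336, 21.91342, 25.08169, 22.43068, 21.91607),
  comparewithid = c(NA, 1, 2, 3, 1, 5, 6, 1, 8, 9)
)

# Rows without a comparison partner (here id 1) drop out.
compared <- scr[!is.na(scr$comparewithid),
                c("id", "time", "ratio", "comparewithid")]

# Each remaining row keeps its own location (time, ratio) but shows the
# value of the id it is compared with.
compared$value <- scr$value[match(compared$comparewithid, scr$id)]
```

The result has nine rows, and the value 23 of id 1 appears at ids 2, 5 and 8, in line with the second listing above.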
The foundation of the package is the structure: a representation of the build-up of the SCR from its risks and subrisks. This structure is supplied as a data.frame passed as a parameter to the geoms `geom_sii_risksurface` and `geom_sii_riskoutline`. The default data.frame is `sii_structure_sf16_eng`, where ‘sf16’ stands for the standard formula as of 2016 and ‘eng’ for English descriptions.
head(sii_structure_sf16_eng, 15)
#> # A tibble: 15 × 3
#> description level childlevel
#> <chr> <chr> <chr>
#> 1 SCR 1 2
#> 2 BSCR 2 3
#> 3 operational 2 <NA>
#> 4 Adjustment-LACDT 2d <NA>
#> 5 BSCR_div 3d <NA>
#> 6 market 3 4.01
#> 7 life 3 4.02
#> 8 non-life 3 4.03
#> 9 health 3 4.04
#> 10 cp-default 3 <NA>
#> 11 intangibles 3 <NA>
#> 12 market_div 4.01d <NA>
#> 13 m_interestrate 4.01 <NA>
#> 14 m_equity 4.01 <NA>
#> 15 m_property 4.01 <NA>
A Dutch version, `sii_structure_sf16_nld`, is present in the package.
The hierarchy of the elements in `description` is determined by `level` and the level of their components (`childlevel`). SCR has a mandatory level (character value) “1”. Rows with a suffix ‘d’ indicate a diversification item.
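To make the role of `level` and `childlevel` concrete, the sketch below looks up the components of an item in a cut-down copy of the structure. The helper `children_of` and the data.frame `toy_structure` are hypothetical names used for illustration only; the package may resolve the hierarchy differently.

```r
# A few rows retyped from head(sii_structure_sf16_eng, 15).
toy_structure <- data.frame(
  description = c("SCR", "BSCR", "operational", "Adjustment-LACDT", "BSCR_div",
                  "market", "life", "market_div", "m_interestrate", "m_equity"),
  level       = c("1", "2", "2", "2d", "3d", "3", "3", "4.01d", "4.01", "4.01"),
  childlevel  = c("2", "3", NA, NA, NA, "4.01", "4.02", NA, NA, NA),
  stringsAsFactors = FALSE
)

# Components of an item: all rows whose level equals the item's childlevel,
# plus the diversification row of that level (suffix 'd').
children_of <- function(structure_df, item) {
  child <- structure_df$childlevel[structure_df$description == item]
  if (length(child) == 0 || is.na(child[1])) {
    return(character(0))
  }
  wanted <- c(child[1], paste0(child[1], "d"))
  structure_df$description[structure_df$level %in% wanted]
}

children_of(toy_structure, "market")
children_of(toy_structure, "operational")
```

In this toy structure, "market" resolves to its diversification row and the two market subrisks, while "operational" has no components.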
For other localizations or for use with internal models, another structure can be passed to the geom. See my interpretation of the internal model of the Dutch insurer “Nationale Nederlanden” in `sii_z_ex6_structure`. Changing the level numbering or the descriptions of items may make it necessary to change other (parameter) files as well (i.e. levelmax, plotdetails, colouring sets).
When reporting the SCR composition of a large insurance company, many risks will be present. This can lead to a very cluttered plot in which all information is present but which is difficult to interpret. The package provides the means to restrict the number of items to ‘k’ (in general or for each level separately) by means of the parameter `levelmax`. This can be an integer, applied to all levels, or a data.frame. The default value is 99, so in practice grouping only occurs for a level with around a hundred sub-risks.
The parameter value `levelmax = sii_levelmax_sf16_995` shows all items on the higher levels (lower level numbers) but restricts the lower levels (higher numbers) to 4 individual risks and 1 grouping of the smallest risks in that level.
sii_levelmax_sf16_995
#> # A tibble: 8 × 2
#> level levelmax
#> <chr> <dbl>
#> 1 1 99
#> 2 2 99
#> 3 3 99
#> 4 4.01 5
#> 5 4.02 5
#> 6 4.03 5
#> 7 4.04 5
#> 8 5 5
Combining the structure and the levelmax-information leads to an expanded structure of which the lines for levels 3 and 4.01 are shown here:
#> # A tibble: 15 × 4
#> description level childlevel levelmax
#> <chr> <chr> <chr> <dbl>
#> 1 market 3 4.01 99
#> 2 life 3 4.02 99
#> 3 non-life 3 4.03 99
#> 4 health 3 4.04 99
#> 5 cp-default 3 <NA> 99
#> 6 intangibles 3 <NA> 99
#> 7 market_div 4.01d <NA> 99
#> 8 m_interestrate 4.01 <NA> 5
#> 9 m_equity 4.01 <NA> 5
#> 10 m_property 4.01 <NA> 5
#> 11 m_spread 4.01 <NA> 5
#> 12 m_currency 4.01 <NA> 5
#> 13 m_concentration 4.01 <NA> 5
#> 14 m_illiquidity 4.01 <NA> 5
#> 15 market_other 4.01o <NA> 99
The row with level `4.01o` is the added row. Its description is derived from the row where `childlevel` = 4.01 and from the value of the parameter `aggregatesuffix` (default value is “other”).
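The expansion can be mimicked on a toy structure. The sketch below is an illustration under assumptions, not the package's implementation: it assumes that levels missing from the levelmax table fall back to 99 and that one ‘o’ row is added for every item that has a `childlevel`. All `toy_` names are hypothetical.

```r
toy_structure <- data.frame(
  description = c("market", "cp-default", "market_div",
                  "m_interestrate", "m_equity"),
  level       = c("3", "3", "4.01d", "4.01", "4.01"),
  childlevel  = c("4.01", NA, NA, NA, NA),
  stringsAsFactors = FALSE
)
toy_levelmax <- data.frame(
  level    = c("3", "4.01"),
  levelmax = c(99, 5),
  stringsAsFactors = FALSE
)
aggregatesuffix <- "other"

# Attach levelmax by level; levels without an entry get 99 (assumption).
expanded <- toy_structure
expanded$levelmax <- toy_levelmax$levelmax[match(expanded$level,
                                                 toy_levelmax$level)]
expanded$levelmax[is.na(expanded$levelmax)] <- 99

# One candidate grouping row per item that has components (assumption).
parents <- toy_structure[!is.na(toy_structure$childlevel), ]
other_rows <- data.frame(
  description = paste(parents$description, aggregatesuffix, sep = "_"),
  level       = paste0(parents$childlevel, "o"),
  childlevel  = NA_character_,
  levelmax    = 99,
  stringsAsFactors = FALSE
)
expanded <- rbind(expanded, other_rows)
```

Here the row "market_other" with level "4.01o" is appended, matching the last row of the listing above.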
The data (in tidyverse format!) is combined with the expanded structure by means of a left join, with the data on the left-hand side. Because the data is not expected to contain `o`-lines, these will not be present in the merged table. When a possible grouping line is present in the expanded structure, a check is made whether the data contains so many risks for that level that actual grouping is needed. (The dataset can contain fewer risks than the structure that is used; e.g. a pure life insurance company can use the standard `sii_structure_sf16_eng` without any problems.)
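A small base R sketch of the join and the grouping check follows. It assumes that grouping keeps the `levelmax - 1` largest items and sums the remainder into one extra row; how the package treats diversification items or ties is not covered. The values and the helper `group_smallest` are hypothetical.

```r
toy_data <- data.frame(
  description = c("m_interestrate", "m_equity", "m_property", "m_spread",
                  "m_currency", "m_concentration", "m_illiquidity"),
  value = c(10, 20, 5, 8, 3, 1, 2),
  stringsAsFactors = FALSE
)
toy_expanded <- data.frame(
  description = c(toy_data$description, "market_other"),
  level       = c(rep("4.01", 7), "4.01o"),
  levelmax    = c(rep(5, 7), 99),
  stringsAsFactors = FALSE
)

# Left join with the data on the left: the 'o' row has no match in the data
# and therefore does not appear in the merged table.
merged <- merge(toy_data, toy_expanded, by = "description", all.x = TRUE)

# Group only when the level holds more items than levelmax allows.
group_smallest <- function(df, other_name) {
  k <- df$levelmax[1]
  if (nrow(df) <= k) {
    return(df[, c("description", "value")])
  }
  df <- df[order(df$value, decreasing = TRUE), ]
  kept <- head(df, k - 1)
  rest <- tail(df, nrow(df) - (k - 1))
  rbind(
    kept[, c("description", "value")],
    data.frame(description = other_name, value = sum(rest$value),
               stringsAsFactors = FALSE)
  )
}

grouped <- group_smallest(merged, "market_other")
```

With seven market risks and a levelmax of 5, four risks remain individually visible and the three smallest are combined, leaving the total unchanged.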
Now that it is known which lines in the combined structure/data data.frame should be plotted, it is time to convert the data into circle segments. The SCR of the data row with the largest SCR value is defined as a full circle with radius = 1, whatever the values of x and y. When combining several calls to `geom_sii_risksurface` and/or `geom_sii_riskoutline`, the parameter `maxscrvalue` overrides this extracted value. All plot elements are scaled so that their surface reflects the value of the item. Additional manual horizontal and vertical scaling is possible to retain the round shape, depending on the ranges of the x and y values on the axes.
For other levels, the circle segments are defined by an inner and an outer radius and by the (compass) degrees of the first and last radial line (clockwise). The inner radius is defined by the outer radius of the next higher level. The number of compass degrees is defined by the share of each item's value among its (same-level) ‘peers’. The value, represented as surface, dictates the outer radius.
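The relation between value, angle and outer radius follows from the area of a ring segment. The helper below is a geometric sketch under the assumption that a full circle of radius 1 (area pi) represents `maxscrvalue`; the function name and its arguments are hypothetical and not part of the package.

```r
# Area of a ring segment: (degrees / 360) * pi * (r_outer^2 - r_inner^2).
# Setting this equal to pi * value / maxscrvalue and solving for r_outer:
segment_outer_radius <- function(value, maxscrvalue, r_inner, degrees) {
  sqrt(r_inner^2 + (value / maxscrvalue) * 360 / degrees)
}

# The largest SCR itself: a full circle, no inner radius.
segment_outer_radius(value = 100, maxscrvalue = 100, r_inner = 0, degrees = 360)
# returns 1

# An item worth half of the largest SCR, drawn over 180 degrees
# on top of a circle with radius 1.
segment_outer_radius(value = 50, maxscrvalue = 100, r_inner = 1, degrees = 180)
```

The narrower the angle available to an item, the further its outer radius has to extend to show the same surface.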
Where applicable, a rotation is applied: a rotation such that the first radial line of a specific (sub)risk points to 12 o’clock, an added fixed rotation, or both.
A final transformation to a squared form is possible. To keep the surfaces correct, the ‘radial’ lines are adjusted. This might lead to unpredictable results in combination with a rotation that is not a multiple of 45 degrees, or with description-based rotation.
The (transformed/rotated) corner points are translated into polygon points (for `geom_sii_risksurface`) or line segments (for `geom_sii_riskoutline`).
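How corner information becomes a polygon can be sketched as follows. The function `segment_polygon` is a hypothetical illustration, not the package's code: it assumes compass degrees (0 at 12 o'clock, increasing clockwise), an optional fixed `rotation` in degrees, and `n` points per arc.

```r
segment_polygon <- function(x0, y0, r_inner, r_outer, deg_from, deg_to,
                            rotation = 0, n = 25) {
  # Compass convention: 0 degrees points up, angles increase clockwise,
  # hence sin() for x and cos() for y.
  theta <- (seq(deg_from, deg_to, length.out = n) + rotation) * pi / 180
  data.frame(
    # Outer arc clockwise, then inner arc back to the start.
    x = x0 + c(r_outer * sin(theta), r_inner * sin(rev(theta))),
    y = y0 + c(r_outer * cos(theta), r_inner * cos(rev(theta)))
  )
}

# A quarter ring between radius 1 and 2, from 12 o'clock to 3 o'clock.
quarter <- segment_polygon(0, 0, r_inner = 1, r_outer = 2,
                           deg_from = 0, deg_to = 90)
```

A data.frame like `quarter` could be handed to `ggplot2::geom_polygon()`; for an outline, only the arcs or radial lines of interest would be drawn as paths.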
The final step is to define which of all these polygons or line segments will actually be plotted. By default everything is plotted, but passing a data.frame to the parameter `plotdetails` can determine this per `level` or per `description`.
In the showcase two data.frames are used, differing only in column `surface` but equal for outline1 to outline13. One of them is shown here.
sii_z_ex1_plotdetails
#> levelordescription surface outline1 outline2 outline3 outline4 outline11
#> 1 1 TRUE NA TRUE NA NA TRUE
#> 2 2 TRUE TRUE NA TRUE NA NA
#> 3 2d TRUE NA NA NA NA NA
#> 4 3 TRUE TRUE TRUE TRUE NA NA
#> 5 3d TRUE NA NA NA NA NA
#> 6 4.01 FALSE NA TRUE NA NA TRUE
#> 7 4.01d FALSE NA NA NA NA NA
#> 8 4.01o FALSE NA TRUE NA NA TRUE
#> 9 4.02 FALSE NA TRUE NA NA TRUE
#> 10 4.02d FALSE NA NA NA NA NA
#> 11 4.02o FALSE NA TRUE NA NA TRUE
#> 12 operational NA TRUE TRUE TRUE NA NA
#> 13 cp-default NA TRUE TRUE TRUE NA NA
#> outline13
#> 1 TRUE
#> 2 NA
#> 3 NA
#> 4 NA
#> 5 NA
#> 6 TRUE
#> 7 NA
#> 8 TRUE
#> 9 TRUE
#> 10 NA
#> 11 TRUE
#> 12 NA
#> 13 NA
`surface` is used by `geom_sii_risksurface`, the other columns by `geom_sii_riskoutline`. The table can best be read as follows: for each risk the line of the corresponding `level` is used, possibly overruled by the line with the matching `description` when that line holds an explicit `TRUE` or `FALSE`.
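This reading rule can be written down as a small lookup. The sketch below is an interpretation of the rule, not the package's code; `resolve_plotdetail` and `toy_plotdetails` are hypothetical names, and the rows are a subset of the table above.

```r
toy_plotdetails <- data.frame(
  levelordescription = c("2", "3", "4.01", "operational", "cp-default"),
  surface  = c(TRUE, TRUE, FALSE, NA, NA),
  outline1 = c(TRUE, TRUE, NA, TRUE, TRUE),
  outline2 = c(NA, TRUE, TRUE, TRUE, TRUE),
  stringsAsFactors = FALSE
)

resolve_plotdetail <- function(plotdetails, level, description, column) {
  lookup <- function(key) {
    hit <- plotdetails[[column]][plotdetails$levelordescription == key]
    if (length(hit) == 0) NA else hit[1]
  }
  by_description <- lookup(description)
  # An explicit TRUE or FALSE on the description line wins;
  # otherwise the line of the level is used.
  if (!is.na(by_description)) by_description else lookup(level)
}

# 'operational' is a level-2 item with its own line in the table.
resolve_plotdetail(toy_plotdetails, "2", "operational", "outline2")
resolve_plotdetail(toy_plotdetails, "2", "operational", "surface")
```

For "operational", `outline2` comes from its own description line, while `surface` falls back to the line of level 2 because the description line holds `NA` there.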