Re: st: extract values from kdensity graphic
From
"Seed, Paul" <[email protected]>
To
"[email protected]" <[email protected]>
Subject
Re: st: extract values from kdensity graphic
Date
Thu, 3 May 2012 17:24:21 +0100
Dear Statalist,
As Nick points out, this is becoming quite a complex problem.
I actually would not use -kdensity-, as it does
not capture the essential features of Mike's original data set.
A simpler approach is to look at the differences between successive values,
and declare a new group whenever the gap is large (for a suitable value
of "large"). This can be quite easily done in version 8.
***** Begin example **********
* Enter Mike's data set
set more off
clear
input sampling_event size
1 94.74
2 94.89
3 94.95
4 94.97
5 95
6 95.05
7 95.08
8 96.11
9 96.22
10 96.24
11 96.27
12 96.27
13 96.27
14 96.32
15 96.34
16 97.19
17 97.26
18 97.26
19 97.32
20 97.34
21 97.39
22 98.41
23 100.62
24 100.69
25 100.69
26 100.76
27 100.76
28 100.76
29 100.84
30 100.91
end
list
twoway (scatter size sampling_event)
* Identify groups
sort size
gen step = size - size[_n-1]
* Use -stem- to quickly assess the step sizes
stem step
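* In the example, steps are all <= 0.15 or >= 0.85
* I declare a new group for any step > 0.5
* I could change this depending on the data set
* (step is missing in the first observation; Stata treats missing as larger
* than any number, so the first observation starts group 1)
gen group = step > 0.5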
replace group = sum(group)
* Check groups are well defined
bys group : su size
* Graph the various groups in different colours
graph twoway (connected size sampling_event if group == 1) ///
(connected size sampling_event if group == 2) ///
(connected size sampling_event if group == 3) ///
(connected size sampling_event if group == 4) ///
(connected size sampling_event if group == 5)
* That looks good
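* A sketch of an alternative that does not hard-code the number of groups.
* Assumption: -separate- is acceptable here; it creates one variable per
* group (size1, size2, ...), which the wildcard size? then picks up.
* With more than 9 groups the wildcard would need adjusting.
separate size , by(group)
graph twoway connected size? sampling_event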
* Now try out -kdensity-; pick up the plotted values in x and d
kdensity size , w(0.1) n(30) gen(x d)
graph twoway (connected d x if group == 1) ///
(connected d x if group == 2) ///
(connected d x if group == 3) ///
(connected d x if group == 4) ///
(connected d x if group == 5)
* kdensity just does not seem to capture the groups I see in the simple scatter plot.
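* A possible refinement, offered as a sketch only: x above holds 30 evenly
* spaced grid points, not the observed sizes, so pairing d and x with group
* is only approximate. Evaluating the density at the observed sizes with
* at() lines each estimate up with its own group.
* d_obs is a hypothetical variable name chosen for this illustration.
kdensity size , w(0.1) at(size) gen(d_obs) nograph
graph twoway (connected d_obs size if group == 1) ///
(connected d_obs size if group == 2) ///
(connected d_obs size if group == 3) ///
(connected d_obs size if group == 4) ///
(connected d_obs size if group == 5)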
********** End example **************
Paul T Seed, Senior Lecturer in Medical Statistics,
Division of Women's Health, King's College London
Women's Health Academic Centre KHP
020 7188 3642,
[email protected],
http://www.kcl.ac.uk/medicine/research/divisions/wh/about/people/seedp.aspx
Please do not send unencrypted un-anonymised data to this address.