Importing data file
vnigam
July 10th, 2007, 11:59 AM
I am trying to do a Weibull fit using a data file with 200,000+ lines of data. When I import the data, it shows only 5,531 data points. Is there a way to use all of the data for the analysis?
Also, I want to know if there is a way to segregate a mixed distribution and get the parameters.
Thanks for your help!
Pantelis
July 10th, 2007, 02:35 PM
I am not sure what you are importing or how it's structured... However, note that the spreadsheet you are importing into is limited to 64,000 rows, so that is the maximum number of line entries you can make. Note, however, that a single line entry can represent more than one data point (i.e. 3 failed at 100, 1000 were suspended at 120 and so forth), but this requires grouping your data.
Now, the simplest (quickest) way I can recommend to do this (and automatically group the data as well) is to import all of it into Excel (you will need a version of Excel that has more than 64,000 rows; I think Excel 2007 or higher) and then copy and paste it into Weibull++ in portions (groups of fewer than 60K rows). Each time you paste, use Weibull++'s auto group utility to group the pasted data automatically and free space for more entries.
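For anyone without a suitable Excel version, the grouping idea can also be sketched outside Weibull++ before pasting. The snippet below is only an illustration in Python, not the auto group utility itself. It assumes a text file layout of two whitespace-separated columns (a value, then F or S), and the name group_rows is made up for this example.

```python
from collections import Counter

def group_rows(lines):
    """Collapse identical (value, state) rows into (value, state, count) entries.

    Assumed input: one row per line, 'value state', where state is F (failure)
    or S (suspension). This layout is an assumption for the sketch.
    """
    counts = Counter()
    for line in lines:
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        value, state = float(parts[0]), parts[1].upper()
        counts[(value, state)] += 1
    # One entry per distinct (value, state) pair, sorted by value then state
    return sorted((v, s, n) for (v, s), n in counts.items())

# Hypothetical rows mirroring "3 failed at 100, 2 suspended at 120, 1 failed at 250"
sample = ["100 F", "100 F", "100 F", "120 S", "120 S", "250 F"]
for value, state, n in group_rows(sample):
    print(n, state, "at", value)
```

With a real file, the lines would come from something like open("data.txt"), and the grouped output is what takes up rows in the spreadsheet instead of the raw 200,000+ lines.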
Pantelis
July 10th, 2007, 02:36 PM
I missed the second question... but I am unclear as to what you want to do... If you do a mixed Weibull analysis, you get parameters for each subpopulation. What are you trying to segregate?
vnigam
July 11th, 2007, 05:54 AM
I have a data file with mileage on cars. Part of the data is the mileage at which failure occurred, and the rest is censored, i.e. cars still running with no failures. I ran a distribution analysis using Minitab and found that on a Weibull probability plot part of the data fell on the straight line and part of it followed a slightly different pattern. My guess is that part of the data has a distribution with different parameters. Maybe a mixed failure mode! I tried to split the data and analyze each part, but that is becoming very time consuming. My question is: can Weibull++ or any other product do this for me, i.e. group the data and give accurate estimates of the parameters for the groups it identifies?
Regarding the first question, I do not have a version of Excel that can handle such a large data file. Minitab could handle the data file, but its reliability tool is not as versatile as Weibull++.
My data file, which I have open in WordPad as a text document, has two columns of data: one with the time value (mileage) and the other with F or S.
Do you have any suggestions?
Thanks
Arai.M
July 11th, 2007, 11:58 AM
As Pantelis mentions, only 64,000 rows can be used in Weibull++. I am unsure as to why your data would show 5,531 rows. Please contact support on this issue (SupportUSA@ReliaSoft.com).
You would still have issues with 200K+ items, so I suggest repeating the following process as many times as needed (at least 4 for your data): copy and paste 50K rows into Weibull++ and autogroup the data (menu Data\Auto Group Current Data). This alone might be enough to reduce your data to what is required (it will take all identical values and group them together). If not, you might have to round values (or, to be more correct, use intervals). Once you have all your data in one folio, you should be able to select the Weibull distribution and "mixed" as an option. You will then be asked to select the number of populations you would like to use. From what you are saying, it seems to be 2. Weibull++ will then fit 5 parameters to your data (2 Betas, 2 Etas and the proportion of units belonging to one population versus the other). For more info, see http://www.weibull.com/LifeDataWeb/the_mixed_weibull_distribution.htm.
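To make the 5 parameters concrete, here is a rough Python sketch of a two-population mixed Weibull model written as a weighted sum of two 2-parameter Weibull distributions. It is not how Weibull++ computes anything internally. The function names and the example parameter values are assumptions made up for illustration, with p standing for the proportion of units in the first population.

```python
import math

def weibull_reliability(t, beta, eta):
    """2-parameter Weibull reliability R(t) = exp(-(t/eta)^beta)."""
    return math.exp(-((t / eta) ** beta))

def weibull_pdf(t, beta, eta):
    """2-parameter Weibull pdf, valid for t > 0."""
    return (beta / eta) * (t / eta) ** (beta - 1) * math.exp(-((t / eta) ** beta))

def mixed_reliability(t, p, beta1, eta1, beta2, eta2):
    """Two-population mixture: p weights population 1, (1 - p) population 2."""
    return (p * weibull_reliability(t, beta1, eta1)
            + (1 - p) * weibull_reliability(t, beta2, eta2))

def mixed_pdf(t, p, beta1, eta1, beta2, eta2):
    """Mixture pdf, using the same weights as the reliability function."""
    return p * weibull_pdf(t, beta1, eta1) + (1 - p) * weibull_pdf(t, beta2, eta2)

# Hypothetical values only: an early-failure population mixed with a wear-out one
params = dict(p=0.3, beta1=0.8, eta1=20000.0, beta2=3.0, eta2=90000.0)
print("R(50000 miles) =", mixed_reliability(50000.0, **params))
```

Setting p to 1 (or 0) collapses the mixture back to a single 2-parameter Weibull.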
Often, data like yours benefits from doing this analysis from the usage perspective. Most of the time, failures are stated in terms of usage (miles), while suspensions or censored data might be in terms of time (i.e. a 3-year warranty). For more information on usage-based analysis, see http://www.weibull.com/hotwire/issue73/hottopics73.htm. Keep in mind that you might still see mixed behavior with this approach, and you should be able to use a mixed Weibull model here as well.
vnigam
July 11th, 2007, 01:12 PM
:) Thank you very much, Arai! Your thoughts are very valid and useful.
Is there a way by which I could see a plot of the mixed distribution so that I can see how close my data points lie with respect to the fitted line?
Thanks again!!
Vandana:)
Arai.M
July 11th, 2007, 01:51 PM
You can view it the same way you would a standard Weibull probability plot; what you see depends on the parameter estimation method used (see http://www.weibull.com/LifeDataWeb/parameter_estimation.htm). Click on menu Data\Plot or on the first button on the control panel on the right. You will most likely not see a straight line for the model.
Keep in mind that if you are using MLE as the parameter estimation method (which is advisable when the data is heavily censored and the sample size is large), the fit of the points to the line will not give you a sense of fit to the model (since MLE is not trying to fit a line through points as Rank Regression does, and the points are placed in the graph for reference only). In that case, you would look at the LK (likelihood) value. The greater the LK value, the better the fit (e.g. LK Value = -134 using 2P-W and LK Value = -137 using mixed Weibull - 2 pop indicates 2P Weibull is a better fit).
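As a rough illustration of what such a value measures, the Python sketch below computes a log-likelihood for a 2-parameter Weibull on grouped data with failures and suspensions. It assumes that the LK value is a log-likelihood of this general form, which is not confirmed here. The function name, data layout and candidate parameter values are all hypothetical, and the numbers are not expected to match Weibull++ output.

```python
import math

def weibull_log_likelihood(data, beta, eta):
    """Log-likelihood of a 2-parameter Weibull for grouped, right-censored data.

    data: iterable of (time, state, count), state 'F' = failure, 'S' = suspension.
    Failures contribute the log of the pdf, suspensions the log of the reliability.
    """
    total = 0.0
    for t, state, count in data:
        z = (t / eta) ** beta
        if state == "F":
            # ln f(t) = ln(beta/eta) + (beta - 1) * ln(t/eta) - (t/eta)^beta
            total += count * (math.log(beta / eta) + (beta - 1) * math.log(t / eta) - z)
        else:
            # ln R(t) = -(t/eta)^beta
            total += count * (-z)
    return total

# Hypothetical grouped data: 3 failed at 100, 2 suspended at 120, 1 failed at 250
data = [(100.0, "F", 3), (120.0, "S", 2), (250.0, "F", 1)]

# Two made-up candidate parameter sets; the one with the greater value fits better
ll_a = weibull_log_likelihood(data, beta=1.5, eta=200.0)
ll_b = weibull_log_likelihood(data, beta=1.5, eta=2000.0)
print("candidate A:", ll_a)
print("candidate B:", ll_b)
```

MLE searches for the parameters that make this quantity as large as possible, which is why the plotted points play no role in the fit itself.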
vnigam
July 12th, 2007, 05:42 AM
Thanks a lot. I appreciate your time and resource!
Vandana:)