DSpace Collection:
http://hdl.handle.net/1942/1259
2014-06-19T11:55:45Z
The effect of physical training on maintenance and recovery of the functional muscular strength in rats injected with EAE
http://hdl.handle.net/1942/1260
<h5>Title</h5>The effect of physical training on maintenance and recovery of the functional muscular strength in rats injected with EAE
<h5>Authors</h5>EKWEMPE EBWEKOH, Clifford
<h5>Abstract</h5>The aim of this report was to study the effect of physical training on
maintenance and recovery of the functional muscular strength in rats injected with
Experimental Allergic Encephalomyelitis (EAE). For this purpose, 25 rats from different mothers
and of the same gender were randomly selected and then randomly
assigned to two treatment groups: 13 rats were treated with EAE and the remaining 12
formed the control group. Each treatment group was subdivided into two:
swimmers and non-swimmers.
The data have a longitudinal structure, that is, repeated measurements were taken on each
rat over time (25 days). To get an idea of the average behavior of the
population of rats, a mean profile was plotted with the time spent on the Rotorot
as the main response. The profile revealed a non-constant trend over time.
Further, to determine the covariance structure, a variance function was plotted; it also
revealed a non-constant pattern, suggesting an unstructured type.
Methodology: To capture the flexibility of the mean profile, a macro was
implemented to generate fractional polynomial time variables. The resulting fractional
polynomial time variables were used in PROC MIXED in SAS with the REPEATED statement.
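As an illustration of what such a macro might produce, here is a minimal Python sketch of fractional polynomial time variables. It assumes the conventional power set (-2, -1, -0.5, 0, 0.5, 1, 2, 3), where power 0 stands for the natural logarithm and a repeated power p adds the term t^p * ln(t). The abstract does not state which powers the thesis used, and the names `FP_POWERS`, `fp_term` and `fp_variables` are hypothetical.

```python
import math

# Conventional candidate powers for fractional polynomials (an assumption;
# the abstract does not list the powers actually used).
FP_POWERS = (-2, -1, -0.5, 0, 0.5, 1, 2, 3)


def fp_term(t, p):
    """Return t**p, with the convention that power 0 means ln(t)."""
    if t <= 0:
        raise ValueError("fractional polynomials need strictly positive times")
    return math.log(t) if p == 0 else t ** p


def fp_variables(times, powers):
    """Build one column of transformed times per power (degree 1 or 2).

    For a repeated power (p, p) the second column is t**p * ln(t),
    which for p == 0 gives ln(t)**2.
    """
    if not 1 <= len(powers) <= 2:
        raise ValueError("this sketch supports one or two powers only")
    columns = []
    previous = None
    for p in powers:
        if p == previous:
            columns.append([fp_term(t, p) * math.log(t) for t in times])
        else:
            columns.append([fp_term(t, p) for t in times])
        previous = p
    return columns


# Example: days 1..25 transformed with powers (0.5, 2).
days = list(range(1, 26))
sqrt_day, day_squared = fp_variables(days, (0.5, 2))
```

The resulting columns would then enter the mixed model as time covariates in place of a plain linear time effect.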
Conclusion: On average, Experimental Allergic
Encephalomyelitis (EAE) has an effect: rats that were subjected to EAE had a
different evolution pattern over time compared with those in the control group. We
also found that physical training has an effect on the functional strength of the rats.
According to the results, rats that were subjected to EAE and underwent
physical training spent more time on the Rotorot than rats that were not subjected to
physical training.
2006-01-01T00:00:00Z
Peabody picture vocabulary test - Revised data : a Bayesian approach to item response theory
http://hdl.handle.net/1942/9228
<h5>Title</h5>Peabody picture vocabulary test - Revised data : a Bayesian approach to item response theory
<h5>Authors</h5>Arima, Serena
<h5>Abstract</h5>Background: Item Response Theory is the area of psychometrics that deals with the problem of constructing and analyzing psychological and sociological tests. By applying a fully Bayesian approach to this methodology, we analyze a data set obtained by administering the Italian translation of the well-known Peabody Picture Vocabulary Test - Revised (PPVT-R) to a sample of Italian children. In the original English version the items are believed to be in order of increasing difficulty. One main aim of this thesis is to evaluate whether, and to what extent, the translation leads to violations of the increasing difficulty ordering. This aspect is important since, in the original version, the basal and ceiling levels of the test are determined assuming that items are in order of increasing difficulty. Methods: Classical item response models, such as the 1PL and 2PL models, have been applied to the PPVT-R data. These models have been extended by including covariates. Parameter estimation has been performed using a fully Bayesian approach, which in this case proved more flexible than the classical approach. In particular, the flexibility of the Bayesian approach has been highlighted with respect to the analysis of an incomplete data matrix, which arises from the nature of the stopping rule, and to the item difficulty comparisons.
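For readers unfamiliar with the 1PL and 2PL models, the following Python sketch shows the standard logistic item characteristic curve and a naive check for violations of the increasing difficulty ordering. It is only an illustration under stated assumptions: the function names and the difficulty values are hypothetical, and the check compares point estimates, whereas the thesis compares difficulties through Bayesian decision rules on the posterior.

```python
import math


def icc_2pl(theta, difficulty, discrimination=1.0):
    """Probability of a correct response under the 2PL model.

    theta is the ability, difficulty the item location, and
    discrimination the item slope; discrimination=1.0 gives the 1PL model.
    """
    return 1.0 / (1.0 + math.exp(-discrimination * (theta - difficulty)))


def ordering_violations(difficulties):
    """Return index pairs (i, i + 1) where item i + 1 is easier than item i."""
    return [
        (i, i + 1)
        for i in range(len(difficulties) - 1)
        if difficulties[i + 1] < difficulties[i]
    ]


# Hypothetical difficulty estimates for four consecutive items.
estimated = [-1.0, 0.2, 0.1, 0.5]
violations = ordering_violations(estimated)
```

A child whose ability equals the item difficulty has a 0.5 probability of answering correctly, which is why the difficulty parameter is a natural basis for ordering items.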
Several decision rules for the comparison of item difficulties have been analyzed. We propose a further, more general alternative method that allows item characteristic curves to be compared while also taking into account the ability distribution. Results and conclusions: The 1PL and 2PL models and the model with covariates have been estimated using MCMC methodology. The goodness of fit of the models has been analyzed using posterior predictive p-values, and the performance of the three models has been compared using the AIC, BIC and DIC indices. The model with covariates proved to be the best in terms of information criteria, so the comparisons of the item difficulties were performed using this model. The different decision rules for the item difficulty comparisons have been compared and an ordering of the items for each criterion has been drawn. The criteria agree in concluding that the increasing difficulty order is violated: from the analysis of the results, we can conclude that the test can be improved by modifying the ordering and by translating the English terms into Italian words in more common use.
2006-01-01T00:00:00Z
The EM-algorithm for modeling Serial Analysis of Gene Expression (SAGE) data
http://hdl.handle.net/1942/3650
<h5>Title</h5>The EM-algorithm for modeling Serial Analysis of Gene Expression (SAGE) data
<h5>Authors</h5>AMPE, Michèle
<h5>Abstract</h5>Serial Analysis of Gene Expression (SAGE), a technique developed at Johns
Hopkins University in the USA, allows the analysis of overall gene expression patterns. It
is an open platform because, unlike microarrays, SAGE does not require a preexisting clone.
SAGE can therefore be used for the identification and quantification of known genes as well as new genes. From a statistical point of view, a SAGE experiment consists of the following 7 steps:
1. Extract a sample of mRNA fragments from a biological sample.
2. Convert the mRNA fragments into cDNA clones.
3. Generate tags by cutting 10 or 17 base long segments from a certain site of the cDNA. These tags are what we call the true tags.
4. Apply the PCR (Polymerase Chain Reaction) procedure to boost the counts of the tags.
5. Link the tags to form long sequences.
6. Take a sample of those sequences.
7. Read off tag counts by sequencing these chosen sequences. The resulting tags are called sequenced tags and the resulting counts are the observed counts.
Note that no true tags are lost before, during or after sequencing, hence the number of sequenced tags is equal to the number of true tags. In the following sections we will assume
that the true tags uniquely identify mRNA fragments that are present in the biological sample. The result of a SAGE experiment, called a SAGE library, contains the observed counts. Hence a SAGE experiment can only measure the expression levels of the tags. We can get the gene expression levels from a SAGE library by mapping the tags onto the genes. The aspects of SAGE experiments that bias the outcomes have been studied by simulating libraries by Stollberg et al. (2000). The following four sources of errors are considered:
(1) sampling errors in tag selection;
(2) sequencing errors;
(3) non-uniqueness of tag sequences; and
(4) non-randomness of DNA sequences.
The authors have provided a maximum likelihood approach to estimate the number of unique transcripts and their frequency distribution. In what follows, we will focus on sequencing errors. Sequencing errors have a large impact on the outcome of a SAGE experiment: non-existing tags may be introduced at low abundance and the real abundance of the other tags may decrease.
Colinge and Feger (2001) introduced an approach to identify tags whose abundance is biased by sequencing errors. Their approach is based on a concept of neighbourhood, i.e. abundant tags can contaminate tags whose sequence is very close. They assume constant error probabilities and use matrix inversion to correct for sequencing errors.
There are also more biological approaches to the problem of sequencing errors, as in Blades et al. (2004a,b). In Blades et al. (2004a), the fact that frequency distributions of tags display a regularity across cell types and species is used to:
• automatically discount low counts that are not reliable for the comparison of expression levels across conditions for a specific gene;
• transform the tag counts to a scale that provides a more reliable correlation and clustering of genome-wide expression profiles.
They state that the transformation enhances the ability to distinguish between signal and noise in SAGE data. Blades et al. (2004b) observed a linear relationship between the copy number of a given tag and the number of observed tags which differ from the given tag by a single base. By transforming the slope of this relationship, an estimate of the sequencing error rate can be found. Akmaev and Wang (2004) estimated error rates based on a mathematical model that includes the PCR and sequencing error contributions. About 3.5% of Long SAGE tags (10-17 base pair tags) will inherit errors from the PCR amplification and 17.3% of the Long SAGE tags will have sequencing errors. Beissbarth et al. (2004) introduced a statistical model for the propagation of sequencing errors and proposed an Expectation-Maximization (EM) algorithm to correct for the sequencing errors given a library of observed sequences and base-calling error estimates. The suggested correction method adjusts the tag counts to be closer to the true counts, so that the bias introduced by the sequencing errors can be partly corrected. In the article, they make use of the sequence neighbourhood of SAGE tags. This means that they assume that sequencing errors can only come from first-order neighbour tags. First-order neighbour tags are tags that differ from each other by only 1 nucleotide, e.g. AAAA and AAAC are first-order neighbour tags. The authors simulate the true tag counts by sampling from a Poisson distribution with mean pλ, with p the proportion of a tag in the library and λ a parameter for setting the size of the library. An observed tag sequence is generated from a true tag sequence using the simulated quality values of the true tag sequence (given by a base-calling program as a function of the probability of a base-calling error) as the multinomial probabilities, i.e. replacing each base with one of the other three bases with the probability specified by the sequencing quality value of that base. The counts of the observed tags are then summed to give the observed tag counts. The implementation of the algorithm is done in R.
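To make the idea of an EM correction for sequencing errors concrete, here is a minimal Python sketch for a single library. It is not the implementation of Beissbarth et al. (2004) or of this thesis. It assumes a known, constant per-base error rate `eps`, an erroneous base replaced uniformly by one of the other three bases, and misreads allowed between any pair of tags; the names `misread_prob` and `em_correct` and all numbers are hypothetical.

```python
from itertools import product

BASES = "ACGT"


def misread_prob(true_tag, obs_tag, eps):
    """P(read obs_tag | true_tag) with independent per-base error rate eps.

    Assumption: an erroneous base is replaced uniformly by one of the
    other three bases.
    """
    p = 1.0
    for t, o in zip(true_tag, obs_tag):
        p *= (1.0 - eps) if t == o else eps / 3.0
    return p


def em_correct(observed, eps, tag_len, n_iter=500):
    """Estimate true tag proportions from observed counts by EM.

    observed maps tag -> observed count; tags missing from it count as 0.
    """
    tags = ["".join(t) for t in product(BASES, repeat=tag_len)]
    k = len(tags)
    y = [observed.get(t, 0) for t in tags]
    n = sum(y)
    err = [[misread_prob(t, o, eps) for o in tags] for t in tags]
    pi = [1.0 / k] * k  # start from equal proportions
    for _ in range(n_iter):
        # Probability of observing each tag under the current proportions.
        marg = [sum(pi[i] * err[i][j] for i in range(k)) for j in range(k)]
        # E-step and M-step combined: expected number of reads that truly
        # come from tag i, divided by the library size.
        pi = [
            pi[i] * sum(y[j] * err[i][j] / marg[j] for j in range(k) if y[j] > 0) / n
            for i in range(k)
        ]
    return dict(zip(tags, pi))


# Hypothetical two-base library: most reads are AA, a few look like AC.
estimate = em_correct({"AA": 950, "AC": 30, "GT": 20}, eps=0.02, tag_len=2)
```

Two-base tags keep the example fast; with longer tags the full tag-by-tag error matrix grows as 4 to the power of the tag length, which is one practical reason to restrict misreads to first-order neighbours.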
We also propose a statistical model for the propagation of sequencing errors in the case
that we have multiple SAGE libraries and correct for the sequencing errors through an EM
algorithm, using a strategy similar to that of Beissbarth et al. (2004). We use MATLAB for the implementation. There are, however, some differences between our method and the one developed by Beissbarth et al. (2004). We assume that the true tag counts follow a multinomial distribution with parameters π and N, where π is the vector of probabilities that represent the relative expression levels of the DNA fragments and N is the number of true tags. The error estimates which we propose are partly based on the estimate given in Akmaev and Wang (2004). Another difference is that we assume that a tag can be misread as any of the possible tags, instead of only as one of its first-order neighbours. Finally, Beissbarth et al. (2004) work with Long SAGE sequences, while we work with sequences of four base pairs because we do not use the restriction to first-order neighbours. In section 2, we explain the notation and the settings that we will use throughout this thesis. In section 3, we give a detailed mathematical description of the EM algorithm with the expressions for the estimates of the expression probabilities π and the corresponding variance-covariance matrix. In section 4, we simulate SAGE libraries to study the following:
• the potential gain in terms of bias when we use estimates obtained by the EM algorithm instead of the observed expression probabilities;
• the potential gain in terms of bias when we use multiple libraries instead of a single library;
• the effect of the probabilities of sequencing errors;
• the comparison of the bias using our method and using the method of Beissbarth et al. (2004).
The results of the simulations are given in section 5.
2007-01-01T00:00:00Z
Defaulters in a cohort of HIV infected patients
http://hdl.handle.net/1942/3406
<h5>Title</h5>Defaulters in a cohort of HIV infected patients
<h5>Authors</h5>AKINDUNJOYE, Oluwaseyi
<h5>Abstract</h5>The advent of antiretroviral therapy (ART) has transformed HIV/AIDS from a
primarily deadly disease into a chronic disease characterized by enhanced quality of
life and increased life expectancy. Once started, antiretroviral treatment should be
continued lifelong, and adherence to this treatment should be nearly perfect to enable
long-term efficacy. Despite the improvements in management in Europe, HIV
infected persons still remain vulnerable to dropout and loss to follow-up from care and
treatment. Therefore, this research reviews the defaulter rate during the last five
years (2002-2006) at the HIV outpatient clinic and how it evolved during this
period.
The data were explored using Kaplan-Meier curves to assess the assumption of
proportional hazards, i.e. whether the estimated
survival functions for two groups of survival data are approximately parallel (do not
cross), and the log-rank and Gehan-Wilcoxon tests were used to compare the survival
estimates between two or more groups. The survival data were modeled using Cox’s
proportional hazards model to explore the relationship between survival and
explanatory variables, and thereby to analyze the effect of several risk factors on
survival.
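As a reminder of how the Kaplan-Meier estimate used in this exploration is computed, here is a minimal Python sketch. The data are hypothetical and the function name `kaplan_meier` is an assumption; in this setting the "event" would be defaulting, and patients still in care at the end of follow-up would be censored.

```python
def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct time with at least one event.

    times are follow-up times; events are 1 (event observed) or 0 (censored).
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all subjects sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events:
            survival *= 1.0 - n_events / n_at_risk
            curve.append((t, survival))
        # Both events and censored subjects leave the risk set after time t.
        n_at_risk -= n_leaving
    return curve


# Hypothetical follow-up times (years) and event indicators for five patients.
curve = kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 1, 0])
```

Plotting such curves per group (for example by gender or risk group) is the visual check described above: roughly parallel, non-crossing curves are compatible with proportional hazards.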
Collett’s approach to model selection was applied to select the best model, ignoring the
missingness mechanism in the data. In order to assess the adequacy of the fitted
model, residual plots and some formal tests (time-dependent covariates) were used to
check the assumption of proportional hazards. A stratified analysis was carried out
to compare Cox’s proportional hazards model with the stratified model, which indicates
whether the fitted model is adequate. Single and multiple imputation
methods were used to investigate the nature of the missingness mechanism and its
effect, considering the possibility that the data are Missing at Random (MAR).
Based on this study, there was an association between defaulting and gender, risk
group, clinical stage, sexual preference, origin group, ART, viral load and age group.
The evolution over time shows an increasing trend in the number of patients who
default. Our analysis showed that 13.97% of 1167 patients defaulted while 8.94% were
lost to follow-up. Of the 163 defaulters, more than half defaulted in 2006
alone (53.99%), while 3.68% defaulted in the first year of ART. Of all the patients,
67.15% were male, but proportionally more females than males defaulted (19.84%),
and 13.05% of females were lost to follow-up. It was observed that the model does not
satisfy the proportional hazards assumption even with time-dependent covariates. The
global goodness-of-fit test shows that the model fits at a borderline significance
level, but after correcting for missingness the PH assumption holds for both responses.
2007-01-01T00:00:00Z