Use this identifier to cite or link to this item:
https://www.repositorio.mar.mil.br/handle/ripcmb/847104
Title: | Online Large-Scale Hypothesis Testing with Corrupted Data |
Author(s): | Alves, Victor Benicio Ardilha da Silva |
Advisor(s): | Szechtman, Roberto; Chen, Louis |
Keywords: | False discovery rate; Power; Data corruption; Cascading effect |
DGPM knowledge areas: | Production engineering applied to operations research and innovation management |
Document date: | 2024 |
Publisher: | Naval Postgraduate School (NVS) |
Description: | This thesis examines the robustness of the Levels Based On Recent Discovery (LORD) algorithm when exposed to corrupted data, particularly within critical real-time processing environments like the Brazilian Navy’s Blue Amazon Management System (SisGAAz). Our study reveals that maintaining the integrity of statistical testing is crucial, mainly where decision-making depends on the accuracy of data analysis conducted online. Our research identifies and rigorously evaluates effective mitigation strategies against probabilistic data corruption scenarios. Key findings highlight the robust efficacy of “phantom” rejections and the strategic integration of the LORD algorithm with the online Benjamini and Hochberg (BH) algorithm, a variation adapted from the traditional offline BH method. These approaches, we assert, maintain testing power significantly, even under adversarial manipulations, instilling confidence in their effectiveness. We propose a controlled adversarial setup involving two entities: “Blue,” the defender who aims to make true discoveries, and “Red,” the attacker focused on data corruption. Our analysis investigates several attack scenarios. The first is a singular anticipated attack that manipulates the first true discovery and traditionally triggers a cascade effect, countered by adjusting the decay rate of each test level to buffer against such disruptions. Additionally, we explore multiple p-value corruption scenarios where strategically placed “phantom” rejections can reclaim compromised testing power, although this strategy faces practical challenges due to the necessity of predicting attack probabilities. Lastly, indiscriminate attacks on any p-value show that integrating the LORD algorithm with the online BH algorithm is exceptionally effective, maintaining the algorithm’s robustness even amidst widespread corruption.
The thesis concludes that while prevalent algorithms are adequate for handling the false discovery rate (FDR) in trustworthy data scenarios, their effectiveness diminishes under adversarial data manipulation, a common issue in real-time data environments. Our findings suggest that enhancing algorithmic robustness against data corruption supports reliability in statistical testing and contributes to broader research and application in adversarial conditions. We propose new avenues for future investigation, such as exploring data corruption impacts on other existing algorithms and developing a “pure” algorithm. This new algorithm could offer a more robust alternative to the current mixed approach, providing a stronger defense against data manipulation. |
Access type: | Open access |
URI: | https://www.repositorio.mar.mil.br/handle/ripcmb/847104 |
Type: | Dissertation |
Appears in collections: | Naval Engineering: Dissertation Collection |
Files associated with this item:
File | Description | Size | Format | |
---|---|---|---|---|
Dissertação - CC Benicio.pdf | | 8.21 MB | Adobe PDF | View/Open |
Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.