ALBERT

All Library Books, journals and Electronic Records Telegrafenberg

  • 1
    Publication Date: 2024
    Description: This study examines the management of electronic waste (e-waste) from discarded personal electronic items and mobile phones, prompted by the sharp rise in the use of these devices among the population of Hassan city in recent years. The principal objectives were to investigate existing disposal methods for electronic devices, including mobile phones, and to collect basic data on disposal practices within Hassan city, Karnataka State. The study also gauged respondents' awareness of the potential hazards posed by e-waste. A significant proportion of the Hassan population typically retains electronic devices, especially cell phones, once they become outdated and obsolete. The most widespread disposal method is selling these gadgets to scrap dealers or junk shops, whereas recycling remains underutilized: only a small minority of individuals recycle. Some 65% of respondents expressed concern about the adverse effects of improper e-waste disposal on human health and the environment, yet all respondents admitted to having no knowledge of the fate of their discarded devices. Based on the survey findings, a comprehensive review of e-waste management in Hassan city, covering mobile phones and other gadgets, is strongly recommended. The purpose of these surveys and data collection efforts is to approximate the volume of e-waste generated through the disposal of these devices. This information is intended to assist stakeholders and government agencies in formulating effective legislation and policies for e-waste management.
    Print ISSN: 2321-421X
    Electronic ISSN: 2230-7990
    Topics: Geography , Geosciences
    Published by STM Journals
  • 2
    Publication Date: 2024
    Description: Indian Railways is one of the largest railway systems in the world. The system has grown significantly over the years, as seen in the massive construction of its tracks; nonetheless, some accidents are caused by fractures in the railway track. Cracks may occur because of the track's expansion or contraction brought on by seasonal variations. This study proposes a crack monitoring vehicle that uses an ultrasonic sensor to detect fractures on railway tracks and an Arduino Uno to drive GSM and GPS modules that send an SMS to the testing station, thereby mitigating the problems caused by these cracks. The system works like a remote monitoring system that raises an alert to stop the passage of trains on the affected path. The proposed model uses an Arduino, an ultrasonic sensor, a buzzer, a GSM module, and a GPS module.
    Print ISSN: 2321-421X
    Electronic ISSN: 2230-7990
    Topics: Geography , Geosciences
    Published by STM Journals
  • 3
    Publication Date: 2024
    Description: Road safety is a critical concern in the modern world, where advancements in technology have made our lives easier but have also given rise to increased traffic hazards and road accidents. The "Car Accident Detector and Informer" project is a system aimed at enhancing road safety by accurately detecting and reporting car accidents in real time. The project integrates sensors, microcontroller technology, and communication protocols to create an efficient accident detection and notification system. Using GPS and GSM technology for precise location tracking and instant notifications, the system has the potential to reduce emergency response times, save lives, and minimize property damage. This research paper presents a detailed overview of the project, including its objectives, working principles, components, advantages, disadvantages, and prospects.
    Print ISSN: 2321-421X
    Electronic ISSN: 2230-7990
    Topics: Geography , Geosciences
    Published by STM Journals
  • 4
    Publication Date: 2024
    Description: Spatiotemporal data analytics is a dynamic field that seeks to extract valuable information from data that integrates both spatial and temporal dimensions. This article explores the importance of this emerging field and its applications in areas including environmental science, public health, and urban planning. Spatiotemporal data analysis addresses important research questions, such as determining event probabilities, understanding change patterns, identifying associations between events, and predicting future events. However, it comes with many challenges, including managing large datasets, ensuring data quality, and dealing with spatial and temporal autocorrelation. Proposed solutions to these challenges include data reduction and sampling, dimensionality reduction, data compression, spatial and temporal indexes, parallel and distributed processing, and data filtering and preprocessing. Strategies for handling spatial autocorrelation include exploratory data analysis, spatial weight matrices, spatially lagged variables, regression models, spatial attribution, and cluster analysis. For temporal autocorrelation, solutions include time series analysis, differencing, ARIMA models, lagged variables, time series decomposition, exponential smoothing, state space modelling, machine learning, cross-validation, and regularization techniques. These approaches help to address the complexity of spatiotemporal data analysis and unlock its potential in various fields.
    Print ISSN: 2321-421X
    Electronic ISSN: 2230-7990
    Topics: Geography , Geosciences
    Published by STM Journals
  • 5
    Publication Date: 2024
    Description: Diamond drilling, a core drilling technique employing diamond-encrusted drill bits, has emerged as a critical method for extracting cylindrical rock samples from diverse geological formations. This article provides an extensive overview of diamond drilling, encompassing its equipment, applications, challenges, and its pivotal role in geological exploration, mining, and construction projects. The versatility of diamond drilling is evident in its adaptability to various rock types, from soft sedimentary strata to hard crystalline structures. In mining, it serves as an indispensable tool for assessing the quality, depth, and size of mineral deposits. Likewise, in construction and civil engineering, diamond drilling aids in ascertaining geological conditions for safe and stable foundation design. Environmental considerations are paramount in contemporary drilling practices, with containment measures for drilling fluids to mitigate ecological impacts. Safety precautions are rigorously adhered to, ensuring the well-being of workers and the integrity of drilling operations. Furthermore, core samples extracted through diamond drilling are instrumental in geological investigations. These samples, meticulously analyzed, yield insights into rock composition, mineral content, and geological structures. They inform decisions in resource exploration, mine planning, and construction project management. This review underscores the contributions of diamond drilling to our understanding of the Earth's subsurface, emphasizing its adaptability, environmental consciousness, and safety. By examining the critical aspects of this technique, the article shows the impact of diamond drilling on various industries and the scientific community.
    Print ISSN: 2321-421X
    Electronic ISSN: 2230-7990
    Topics: Geography , Geosciences
    Published by STM Journals
  • 6
    Publication Date: 2024
    Description: The COVID-19 pandemic has underscored the necessity for digital infrastructure and accessibility, particularly in the education sector. This paper investigates the benefits and effectiveness of using Green Cloud Computing (GCC) techniques for the dissemination of educational library data in remote regions of India during this critical period. The GCC model, known for its energy efficiency and reduced environmental impact, is proposed as a robust, scalable, and eco-friendly solution for providing remote educational access. The research uses a mixed-methods approach, incorporating both quantitative assessments of data reach and usage and qualitative surveys to understand user experiences. The study reveals that GCC techniques can significantly improve educational resource distribution, thereby mitigating the educational disparities further exacerbated by the pandemic. These findings reinforce the potential of GCC techniques as a sustainable and inclusive technology for reshaping the educational landscape in remote regions.
    Print ISSN: 2321-421X
    Electronic ISSN: 2230-7990
    Topics: Geography , Geosciences
    Published by STM Journals
  • 7
    Publication Date: 2024
    Description: One of the most crucial global issues of our era is climate change, whose consequences are presently being felt all over the world. As climate change continues, it is important that governments, organizations, and businesses take forward-looking action to adapt and protect themselves from calamities. This paper summarizes the application of Remote Sensing (RS) and Geographic Information System (GIS) in observing the impact of climate change on drought, soil moisture, land degradation, food security, EHIs' characterization, and blue carbon science, and reviews AI-based climate solutions. The integration of advanced machine learning algorithms, real-time data analysis, and other cutting-edge technologies could lead to even more effective climate change adaptation strategies. AI-enabled climate change adaptation strategies have the potential to significantly improve the resilience of infrastructure, communities, and businesses to the changing climate.
    Print ISSN: 2321-421X
    Electronic ISSN: 2230-7990
    Topics: Geography , Geosciences
    Published by STM Journals
  • 8
    Publication Date: 2024
    Description: Urban transportation systems face significant challenges due to rapid population growth and development. This study examines the enhancement of road safety in Navi Mumbai using Geographic Information System (GIS) and Remote Sensing (RS) techniques. The escalating rate of vehicular accidents in Navi Mumbai presents a pressing concern. The research investigates accident data and traffic patterns, identifying vulnerable areas prone to accidents and congestion. By conducting spatial analysis using GIS and RS, the study aims to uncover accident hotspots and traffic congestion zones, offering insights into underlying road safety issues. The research methodology involves a multi-stage process. Initial data collection from various sources, including police reports, live traffic data, and satellite imagery, forms the foundation. Geographic coordinates extracted and processed through GIS applications are used to plot accident locations and create density maps. Additionally, on-site investigations at strategically chosen locations provide insights into local conditions, traffic patterns, and factors contributing to congestion and accidents. The study presents tailored solutions for each area, ranging from optimized traffic signal timings to infrastructural improvements. Recommendations encompassing signal optimization, infrastructure enhancements, and community engagement strategies offer a holistic approach to mitigating traffic congestion and reducing accidents. The collaborative effort with relevant authorities, as highlighted in the study, is a crucial step towards implementing these recommendations. This research not only identifies critical areas for intervention but also serves as a model for using GIS and RS techniques to enhance road safety in urban areas, paving the way for safer and more efficient transportation networks.
    Print ISSN: 2321-421X
    Electronic ISSN: 2230-7990
    Topics: Geography , Geosciences
    Published by STM Journals
  • 9
    Publication Date: 2024
    Description: The urgency to transition to renewable energy sources is underscored by the environmental crises stemming from our reliance on non-renewable fuels. This study assesses wind energy potential in Odisha, India, using satellite data and Geographic Information System (GIS) technologies. The research addresses the critical need for strategic planning and site selection before investing in renewable energy infrastructure. By employing a model that integrates various free satellite datasets and applies fundamental physical principles, the study calculates wind power density (WPD) at a height of 90 meters above the surface for both onshore and offshore locations. The methodology involves acquiring and processing datasets such as temperature, wind speed, digital elevation model (DEM), pressure, air density, and land use/land cover (LULC) classifications. The model applies equations derived from physical laws to determine the key parameters needed to calculate WPD. Specifically, temperature and pressure data are used to estimate air density, while surface roughness is assigned based on LULC classes and combined with wind speed at 10 m to extrapolate wind speed to 90 meters above ground level. The method can be used at any hub height. Results reveal significant wind energy potential in Odisha, particularly along the coastal regions. Jagatsinghpur and Puri emerge as areas with high WPD onshore, while the offshore exclusive economic zone (EEZ) of Odisha exhibits substantial wind energy resources. The model outputs provide valuable inputs for studies related to renewable energy and support informed decision-making in site selection analyses. Furthermore, the study emphasizes the simplicity and effectiveness of the developed model, making it a practical tool for assessing wind energy potential in other regions as well. Overall, this research contributes to the global effort towards transitioning to sustainable energy sources and combating climate change. By highlighting the renewable energy potential of Odisha, it underscores the importance of harnessing wind energy as a viable pathway towards a cleaner, greener future.
    Print ISSN: 2321-421X
    Electronic ISSN: 2230-7990
    Topics: Geography , Geosciences
    Published by STM Journals
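
The calculation chain summarized in the description above (air density from temperature and pressure, roughness from LULC class, extrapolation of the 10 m wind speed to hub height, then WPD) can be sketched as follows. The abstract does not give its equations, so this is only an illustration under stated assumptions: the ideal gas law for dry air, the logarithmic wind profile, and the standard relation WPD = 0.5 · ρ · v³. The roughness lengths in `ROUGHNESS_BY_LULC` and all function names are hypothetical and are not taken from the paper.

```python
import math

R_DRY_AIR = 287.05  # J/(kg*K), specific gas constant of dry air

# Hypothetical roughness lengths in metres per LULC class (illustrative values only).
ROUGHNESS_BY_LULC = {"water": 0.0002, "cropland": 0.1, "forest": 1.0}


def air_density(pressure_pa, temperature_k):
    """Air density (kg/m^3) from the ideal gas law for dry air (assumed here)."""
    return pressure_pa / (R_DRY_AIR * temperature_k)


def extrapolate_wind_speed(v_ref, z_ref, z_hub, roughness_m):
    """Scale a wind speed from height z_ref to z_hub with the logarithmic profile (assumed here)."""
    return v_ref * math.log(z_hub / roughness_m) / math.log(z_ref / roughness_m)


def wind_power_density(rho, v):
    """Wind power density in W/m^2: 0.5 * rho * v^3."""
    return 0.5 * rho * v ** 3


def wpd_at_hub(pressure_pa, temperature_k, v10, lulc_class, z_hub=90.0):
    """Combine the three steps for one grid cell, starting from the 10 m wind speed."""
    rho = air_density(pressure_pa, temperature_k)
    z0 = ROUGHNESS_BY_LULC[lulc_class]
    v_hub = extrapolate_wind_speed(v10, 10.0, z_hub, z0)
    return wind_power_density(rho, v_hub)


if __name__ == "__main__":
    # Example cell: sea-level pressure, 15 degrees C, 6 m/s at 10 m over cropland.
    print(round(wpd_at_hub(101325.0, 288.15, 6.0, "cropland"), 1))
```

Because `z_hub` is a parameter, the same sketch applies at any hub height, in line with the abstract's remark that the method is not tied to 90 m.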
  • 10
    Publication Date: 2024
    Description: The research strives to enhance digital surveying methodologies for precise and detailed land mapping, using advanced instruments such as terrestrial scanners, SLAM scanners, total stations, DGPS, and drones. The survey conducted at Ratnagiri Hill in Udaipur aims to scrutinize and compare the merits and drawbacks of each surveying method. The study outlines a systematic process encompassing data collection, processing, output generation, and validation of survey techniques. It highlights the adaptability of these methods across various domains, such as building conservation, restoration, and mapping of typically inaccessible areas, emphasizing their potential for time and resource savings. The research underscores the effectiveness of a one-time data collection process for subsequent work, laying the groundwork for the advancement of digital surveying technologies. When integrated with the total station and DGPS survey, the combination of terrestrial scanners, drones, and SLAM scanners achieves a vertical accuracy of around 32 mm.
    Print ISSN: 2321-421X
    Electronic ISSN: 2230-7990
    Topics: Geography , Geosciences
    Published by STM Journals
  • 11
    Publication Date: 2023
    Description: Land use plays a role in determining man's social, economic, and cultural progress. In general, the idea of land use is connected to the local physical environment. Land use reflects a complex correlation between natural, historical, and socio-economic factors. In addition, the size of holdings and caste structure influence how land use for agricultural crops changes. Factors such as undulating terrain and poor hilly land shape farming practices and, at times, the capacity of farmers. The present study characterizes the changing pattern of land use in the Sagar district. Slightly more than half (52.64%) of the total geographical area was net sown in 2021. This proportion is higher than the 48.93% statewide average. An additional 2.29% of the total area is made up of fallow areas. As a consequence, around 75% of the land had been farmed. The share of forest land (24.46%) is quite similar to the average distribution. Around 1.69% of the area is designated as barren and uncultivable due to physical limitations. For a number of reasons, other uncultivated land accounts for about 10.65% of the total area. Land use patterns are influenced by cropping practices and intensity of farming as well as by social and economic position, institutional makeup, and technological advancement. Much of the district's terrain is ridged. Therefore, regional balances of natural processes are crucial prerequisites for the rising population's access to food security and its ability to get the most out of the resources at hand.
    Print ISSN: 2321-421X
    Electronic ISSN: 2230-7990
    Topics: Geography , Geosciences
    Published by STM Journals
  • 12
    Publication Date: 2023
    Description: The main aim of the present investigation is to study the textural characteristics and to understand the grain size relationship and distribution, using granulometric analysis, along the Pennar estuary on the south-east coast of India. A total of 36 surface sediment samples were retrieved, and textural parameters and size distributions were analysed at six stations in six different micro-environments, viz. dune, backshore, berm, upper foreshore (UFS), middle foreshore (MFS), and lower foreshore (LFS). The samples were further subjected to statistical treatment, viz. Mean size (Mz), Skewness (Sk), Standard deviation (σI), and Kurtosis (KG). The results indicate that the sediment samples were coarse to fine grained, very negatively skewed to positively skewed, very well to moderately sorted, and platykurtic to very leptokurtic in nature, and also indicate two mixed environments at some stations. Scatter plots helped to understand the mode of deposition, geological significance, and transportation of grains along the coast. Scatter plots also indicate that the sediments along the coast were mainly associated with fluvial processes. C-M diagrams demonstrate the type of transportation and deposition of the beach sediments. Ebbing and flooding play a prominent role in changing the characteristics of grains in the Pennar estuary, especially at the estuarine mouth and in adjoining river areas.
    Print ISSN: 2321-421X
    Electronic ISSN: 2230-7990
    Topics: Geography , Geosciences
    Published by STM Journals
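
The four textural parameters named in the description above use the notation of the Folk and Ward graphic measures (Mz, σI, Sk, KG). The abstract does not state which formulas were applied, so the sketch below simply assumes the Folk and Ward definitions, computed from grain-size percentiles in phi units. The function name `folk_ward` and the sample percentiles are hypothetical.

```python
def folk_ward(phi):
    """Folk and Ward graphic measures from phi percentiles.

    `phi` maps each percentile (5, 16, 25, 50, 75, 84, 95) of the cumulative
    grain-size curve to its value in phi units.
    Returns (Mz, sigma_I, Sk, KG).
    """
    p5, p16, p25, p50, p75, p84, p95 = (phi[k] for k in (5, 16, 25, 50, 75, 84, 95))

    mean_size = (p16 + p50 + p84) / 3.0                # graphic mean, Mz
    sorting = (p84 - p16) / 4.0 + (p95 - p5) / 6.6     # inclusive graphic standard deviation
    skewness = ((p16 + p84 - 2.0 * p50) / (2.0 * (p84 - p16))
                + (p5 + p95 - 2.0 * p50) / (2.0 * (p95 - p5)))  # inclusive graphic skewness
    kurtosis = (p95 - p5) / (2.44 * (p75 - p25))       # graphic kurtosis, KG
    return mean_size, sorting, skewness, kurtosis


if __name__ == "__main__":
    # Hypothetical percentiles for a roughly normal sample centred on 2 phi.
    sample = {5: 0.355, 16: 1.0055, 25: 1.3255, 50: 2.0, 75: 2.6745, 84: 2.9945, 95: 3.645}
    print(folk_ward(sample))
```

For a symmetric, near-normal curve such as the hypothetical sample, the skewness is close to 0 and the kurtosis close to 1, which is the reference point against which descriptive classes such as "platykurtic" and "leptokurtic" are judged.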
  • 13
    Publication Date: 2023
    Description: Currently, wind and solar power generation systems have many drawbacks. Making full use of these new energy sources would remove a barrier to growth. For a wind/solar hybrid power plant, the primary problem is how to choose the location scientifically. This article selects six wind/solar hybrid power plant locations as a case study and evaluates them via VIKOR, weighting the indicators through the MCDM method. The conclusions are consistent with related research findings, which demonstrates the feasibility and effectiveness of the method. This macro-site selection approach may provide some theoretical basis for such plants. The approach draws on statistical methods in the literature and is established by statistical analysis. The alternatives are Tamil Nadu, Rajasthan, Maharashtra, Gujarat, Andhra Pradesh, and Karnataka, and the evaluation parameters are total investment, wind direction, wind speed and speed change, sunshine stabilization, wind power density, energy saving, and environmental factors. Rajasthan ranks 1st, Gujarat 2nd, Tamil Nadu 3rd, Karnataka 4th, Maharashtra 5th, and Andhra Pradesh 6th.
    Print ISSN: 2321-421X
    Electronic ISSN: 2230-7990
    Topics: Geography , Geosciences
    Published by STM Journals
  • 14
    Publication Date: 2023
    Description: Flood hazard mapping, which uses model and satellite remote sensing data, is extremely useful for flood monitoring and risk management. The flood inundation extent and flooding depth on Majuli Island and its surrounding area in Assam, India, were simulated using MIKE FLOOD, a coupled 1D-2D hydrodynamic model. MIKE FLOOD is a platform that integrates the MIKE Hydro River (1D) and MIKE 21 FM (2D) models into a dynamically coupled single modeling framework. The study employed daily discharge and water level data from several gauging stations operated by the Central Water Commission (CWC), the Global Flood Monitoring System (GFMS), and the Water Resource Department (WRD) of Assam. First, the MIKE Hydro River (1D) model was calibrated using discharge and water level data from 2016 to 2018 and validated for the period 2019-2021. The calibration and validation results were evaluated using a number of performance metrics. A fine mesh and bathymetry of Majuli Island with a spatial resolution of 10 m were created from ALOS PALSAR DEM / SRTM DEM data and provided as input to the MIKE 21 FM (2D flow model). The MIKE Hydro River (1D) and MIKE 21 FM (2D) models were then linked in MIKE FLOOD through lateral linkages to simulate two-dimensional flood inundation in the study area. Flood inundation was simulated for the year 2020, and the model's maximum flood inundation extent was compared to the actual flooded area retrieved from Sentinel-1 C-band satellite data. The R² in the study area ranged between 0.86 and 0.97, while the WBL in the MIKE Hydro River model was less than 1.23. The total accuracy of MIKE FLOOD is 93.6 percent according to the confusion matrix. According to the most recent model simulation, flooding occurred between July 19 and July 21, 2020, with the greatest and lowest flood depths being 2.38 m and 0.786 m, respectively. In addition, the MIKE FLOOD model may be used for flood control in the future, and this research will aid policymakers in water management in achieving successful mitigation measures.
    Print ISSN: 2321-421X
    Electronic ISSN: 2230-7990
    Topics: Geography , Geosciences
    Published by STM Journals
  • 15
    Publication Date: 2023
    Description: A simulation tool for the performance and detection process of a sky wave over-the-horizon radar includes many stages based on different models: it first creates a synthetic search scenario and then applies digital signal processing to detect and locate a potential target. Its accuracy depends on the quality of the input and the adequacy of the model assumptions. A sensitivity analysis of this simulation tool is carried out by analyzing how the outputs vary as a consequence of changes in input factors. The architecture of the tool allows easy implementation and supports the study of the impact of input variables on detection and location results, which can be useful for dimensioning the features and elements of a real radar.
    Print ISSN: 2321-421X
    Electronic ISSN: 2230-7990
    Topics: Geography , Geosciences
    Published by STM Journals
  • 16
    Publication Date: 2023
    Description: This work provides an overview of the use of remote sensing and Geographic Information System (GIS) techniques for mineral exploration. The integration and analysis of various data types, including geological, geochemical, geophysical, and remote sensing data, using remote sensing and GIS tools allow for the creation of comprehensive maps and models of mineral deposits. The advantages of remote sensing and GIS in mineral exploration include more effective targeting of exploration activities, assessment of environmental and social impacts of mining activities, and creation of predictive models of mineral deposits. However, limitations to their use include the quality and resolution of input data, the expertise of the user, and the availability and accessibility of data. As technology continues to improve and the availability of data increases, the use of remote sensing and GIS is likely to become even more important for the efficient and sustainable exploration and development of mineral resources.
    Print ISSN: 2321-421X
    Electronic ISSN: 2230-7990
    Topics: Geography , Geosciences
    Published by STM Journals
  • 17
    Publication Date: 2023
    Description: Himachal Pradesh state is very vulnerable to flash floods, landslides, and riverbank erosion. Efforts have been made to develop a tool that can predict these disasters before they occur, so that the authorities can take protective measures. A web-enabled vulnerability assessment tool has been developed to identify reaches vulnerable to flood, riverbank erosion, and landslide in Himachal Pradesh. Historical excessive rainfall events, flash flood events and their causes, and the largest instrumented earthquake events have been reviewed and analyzed in detail. A flash flood risk index for rainfall-induced events occurring during the monsoon, a glacial lake outburst flood (GLOF) risk index to monitor glacial lakes, and a landslide risk index have been developed. A river model (HEC-RAS, 1D) has been developed for the inflow forecasting and early warning systems; a flood model (TUFLOW/SOBEK, 2D) has been developed to identify flood-prone areas; a river morphological model (Delft, 3D) has been developed to find the morphologically active areas; and historical satellite imagery from 1973 to 2023 has been analyzed using artificial intelligence technology to identify riverbank erosion areas and active landslide areas. This imagery has also been analyzed for snowmelt forecasting and the GLOF study. A multi-criteria analysis model has been developed to identify vulnerable reaches. All five of these activities have been integrated in the web-enabled real-time vulnerability assessment tool of Himachal Pradesh. The tool can identify reaches vulnerable to riverbank erosion, flood, and landslide in real time and can predict flash floods. The tool is available at: https://www.hpVulnerableReach-kupa.com. It is very useful for the authorities, communities, and stakeholders who take protective measures during a disaster. There is still room for improvement: the tool can be upgraded at a larger scale with the help of the concerned department, or with more accurate data when funds become available.
    Print ISSN: 2321-421X
    Electronic ISSN: 2230-7990
    Topics: Geography , Geosciences
    Published by STM Journals
  • 18
    Publication Date: 2023
    Description: Urban sprawl is increasing rapidly with the increased rate of rural-to-urban migration and tremendous population growth. As a result, urban centers tend to expand outwards to accommodate the ever-increasing population pressure. Urban sprawl has to be considered in spatio-temporal terms to understand the phenomena of urban growth, land use, and land transformation. Urban sprawl has major consequences for the quality of life of urban dwellers as well as for the quantity of rural land lost and degraded, key issues from both agricultural and environmental perspectives. The Himalayan state of Uttarakhand is witnessing steady and continuous urban growth. Urban sprawl in Ranikhet tehsil has resulted in the loss of productive agricultural land in the surrounding rural areas, open green spaces, and surface water bodies, as well as depletion of groundwater. Problems of housing, slums, and unhygienic living conditions are also growing rapidly with the rapid urban sprawl. There is regular competition between urban and rural areas for the land needed for growth and development. The present study finds that the built-up area of the towns of Ranikhet tehsil increased from 4.92 km² in 2000 to 6.19 km² in 2010 and further to 7.70 km² in 2020, reflecting a continuous struggle for space in Ranikhet tehsil.
    Print ISSN: 2321-421X
    Electronic ISSN: 2230-7990
    Topics: Geography , Geosciences
    Published by STM Journals
  • 19
    Publication Date: 2023
    Description: A simulation tool for the performance and detection process of a sky-wave over-the-horizon radar comprises many stages based on different models: it first creates a synthetic search scenario and then applies digital signal processing to detect and locate a potential target. Its accuracy depends on the quality of the input and on the adequacy of the model assumptions. A sensitivity analysis of this simulation tool is carried out by analyzing the variation of the outputs in response to changes in the input factors. The architecture of the tool allows easy implementation and the study of the impact of input variables on detection and location results, which can be useful for dimensioning the features and elements of a real radar.
    Print ISSN: 2321-421X
    Electronic ISSN: 2230-7990
    Topics: Geography , Geosciences
    Published by STM Journals
  • 20
    Publication Date: 2023
    Description: This work provides an overview of the use of remote sensing and Geographic Information System (GIS) techniques for mineral exploration. The integration and analysis of various data types, including geological, geochemical, geophysical, and remote sensing data, using remote sensing and GIS tools allow for the creation of comprehensive maps and models of mineral deposits. The advantages of remote sensing and GIS in mineral exploration include more effective targeting of exploration activities, assessment of environmental and social impacts of mining activities, and creation of predictive models of mineral deposits. However, limitations to their use include the quality and resolution of input data, the expertise of the user, and availability and accessibility of data. As technology continues to improve and the availability of data increases, the use of remote sensing and GIS is likely to become even more important for the efficient and sustainable exploration and development of mineral resources.
    Print ISSN: 2321-421X
    Electronic ISSN: 2230-7990
    Topics: Geography , Geosciences
    Published by STM Journals
  • 21
    Publication Date: 2023
    Description: Himachal Pradesh state is very vulnerable to flash floods, landslides, and riverbank erosion. Efforts have been made to develop a tool that can predict these disasters before they occur, so that the authorities can take protective measures. A web-enabled vulnerability assessment tool has been developed to identify reaches vulnerable to flood, riverbank erosion, and landslide in Himachal Pradesh. Historical excessive rainfall events, flash flood events and their causes, and the largest instrumentally recorded earthquake events have been reviewed and analyzed in detail. A flash flood risk index for rainfall-induced events occurring during the monsoon, a glacial lake outburst flood (GLOF) risk index to monitor glacial lakes, and a landslide risk index have been developed. A river model (HecRAS, 1D) has been developed for the inflow forecasting and early warning system; a flood model (TUFLOW/SOBEK, 2D) has been developed to identify flood-prone areas; a river morphological model (Delft, 3D) has been developed to find the morphologically active areas; historical satellite imagery from 1973 to 2023 has been analyzed using artificial intelligence technology to identify riverbank erosion areas and active landslide areas, and the same imagery has also been analyzed for snowmelt forecasting and the GLOF study. A multi-criteria analysis model has been developed to identify vulnerable reaches. All five activities have been integrated into the web-enabled real-time vulnerability assessment tool for Himachal Pradesh. The tool can identify reaches vulnerable to riverbank erosion, flood, and landslide in real time and can predict flash floods. The tool is available at: https://www.hpVulnerableReach-kupa.com. It is very useful for the authorities, communities, and stakeholders who take protective measures during a disaster.
There is still room for improvement: the tool can be upgraded at a larger scale, either with the help of the concerned department or with more accurate data, once funds become available.
    Print ISSN: 2321-421X
    Electronic ISSN: 2230-7990
    Topics: Geography , Geosciences
    Published by STM Journals
  • 22
    Publication Date: 2023
    Description: Urban sprawl is increasing rapidly with rising rural-to-urban migration and strong population growth. As a result, urban centers tend to expand outwards to accommodate the ever-increasing population pressure. Urban sprawl has to be considered in spatio-temporal terms to understand the phenomena of urban growth, land use and land transformation. It matters both for the quality of life of urban dwellers and for the amount of rural land lost or degraded, which are key issues from agricultural as well as environmental perspectives. The Himalayan state of Uttarakhand is witnessing steady and continuous urban growth. Urban sprawl in Ranikhet tehsil has resulted in the loss of productive agricultural land in the surrounding rural areas, open green spaces and surface water bodies, as well as the depletion of groundwater. Problems of housing, slums, and unhygienic living conditions are also growing rapidly with the urban sprawl. Urban and rural areas compete continually for the land needed for growth and development. The present study finds that the built-up area of the towns of Ranikhet tehsil increased from 4.92 km² in 2000 to 6.19 km² in 2010 and further to 7.70 km² in 2020, reflecting a continuous struggle for space in Ranikhet tehsil.
    Print ISSN: 2321-421X
    Electronic ISSN: 2230-7990
    Topics: Geography , Geosciences
    Published by STM Journals
  • 23
    Publication Date: 2023
    Description: Flood hazard mapping, which uses model and satellite remote sensing data, is extremely useful for flood monitoring and risk management. The flood inundation extent and flooding depth on Majuli Island and its surrounding area in Assam, India, were simulated using MIKE FLOOD, a coupled 1D-2D hydrodynamic model. MIKE FLOOD is a platform that integrates the MIKE Hydro River (1D) and MIKE 21 FM (2D) models into a dynamically coupled single modeling framework. The study employed daily discharge and water level data from several gauging stations operated by the Central Water Commission (CWC), the Global Flood Monitoring System (GFMS), and the Water Resource Department (WRD) of Assam. First, the MIKE Hydro River (1D) model was calibrated using discharge and water level data from 2016 to 2018 and validated for the period 2019-2021. The calibration and validation results of the MIKE Hydro River (1D) model were evaluated using a number of performance metrics. From ALOS PALSAR DEM data / SRTM DEM data, a fine mesh and bathymetry of Majuli Island with a spatial resolution of 10 m were created and provided as input to the MIKE 21 FM (2D, Flow Model). The MIKE Hydro River (1D) and MIKE 21 FM (2D) models were then linked in the MIKE FLOOD model through lateral linkages to simulate two-dimensional flood inundation in the study area. Flood inundation was simulated for the year 2020, and the model's maximum flood inundation extent was compared to the actual flooded area retrieved from Sentinel-1 C-band satellite data. R² in the study area ranged between 0.86 and 0.97, while the WBL in the MIKE Hydro River model was less than 1.23. The total accuracy of MIKE FLOOD is 93.6 percent according to the confusion matrix. The most recent model simulation indicates flooding between July 19 and July 21, 2020, with the greatest and lowest flood depths being 2.38 and 0.786 m, respectively.
In addition, the MIKE FLOOD model may be used for flood control in the future, and this research will aid policymakers in the field of water management in achieving successful mitigation measures.
    Print ISSN: 2321-421X
    Electronic ISSN: 2230-7990
    Topics: Geography , Geosciences
    Published by STM Journals
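The abstract above reports a total accuracy derived from a confusion matrix that compares the simulated flood extent with the satellite-observed one. The sketch below shows how such a figure is usually computed. The cell counts are hypothetical placeholders, not the study's data, and the critical success index is an additional, commonly used metric that the abstract does not mention.

```python
def overall_accuracy(tp, fp, fn, tn):
    """Share of grid cells classified correctly, flooded or dry."""
    return (tp + tn) / (tp + fp + fn + tn)

def critical_success_index(tp, fp, fn):
    """Hits divided by hits, false alarms and misses (ignores correct dry cells)."""
    return tp / (tp + fp + fn)

# Hypothetical counts of grid cells (NOT taken from the study):
# tp = flooded in both model and satellite, fp = flooded in the model only,
# fn = flooded in the satellite image only, tn = dry in both.
tp, fp, fn, tn = 40, 5, 10, 45

print(f"overall accuracy: {overall_accuracy(tp, fp, fn, tn):.3f}")
print(f"critical success index: {critical_success_index(tp, fp, fn):.3f}")
```

Overall accuracy can look high when most of the domain is dry, which is why a metric that ignores correctly predicted dry cells is often reported alongside it.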
  • 24
    Publication Date: 2022
    Description: scikit-multimodallearn is a Python library for multimodal supervised learning, licensed under Free BSD, and compatible with the well-known scikit-learn toolbox (Pedregosa et al., 2011). This paper details the content of the library, including a dedicated multimodal data format as well as classification and regression algorithms. Use cases and examples are also provided.
    Print ISSN: 1532-4435
    Electronic ISSN: 1533-7928
    Topics: Computer Science
  • 25
    Publication Date: 2022
    Description: We propose a theoretical framework for approximate planning and learning in partially observed systems. Our framework is based on the fundamental notion of information state. We provide two definitions of information state---i) a function of history which is sufficient to compute the expected reward and predict its next value; ii) a function of the history which can be recursively updated and is sufficient to compute the expected reward and predict the next observation. An information state always leads to a dynamic programming decomposition. Our key result is to show that if a function of the history (called AIS) approximately satisfies the properties of the information state, then there is a corresponding approximate dynamic program. We show that the policy computed using this is approximately optimal with bounded loss of optimality. We show that several approximations in state, observation and action spaces in literature can be viewed as instances of AIS. In some of these cases, we obtain tighter bounds. A salient feature of AIS is that it can be learnt from data. We present AIS based multi-time scale policy gradient algorithms and detailed numerical experiments with low, moderate and high dimensional environments.
    Print ISSN: 1532-4435
    Electronic ISSN: 1533-7928
    Topics: Computer Science
  • 26
    Publication Date: 2022
    Description: Bayesian Likelihood-Free Inference (LFI) approaches make it possible to obtain posterior distributions for stochastic models with intractable likelihood by relying on model simulations. In Approximate Bayesian Computation (ABC), a popular LFI method, summary statistics are used to reduce data dimensionality. ABC algorithms adaptively tailor simulations to the observation in order to sample from an approximate posterior, whose form depends on the chosen statistics. In this work, we introduce a new way to learn ABC statistics: we first generate parameter-simulation pairs from the model independently of the observation; then, we use Score Matching to train a neural conditional exponential family to approximate the likelihood. The exponential family is the largest class of distributions with fixed-size sufficient statistics; thus, we use them in ABC, which is intuitively appealing and has state-of-the-art performance. In parallel, we insert our likelihood approximation in an MCMC for doubly intractable distributions to draw posterior samples. We can repeat that for any number of observations with no additional model simulations, with performance comparable to related approaches. We validate our methods on toy models with known likelihood and a large-dimensional time-series model.
    Print ISSN: 1532-4435
    Electronic ISSN: 1533-7928
    Topics: Computer Science
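To make the role of summary statistics in ABC concrete, here is a minimal sketch of plain rejection ABC on a toy Gaussian model. It is not the score-matching method proposed in the paper: the simulator, the flat prior, the tolerance and the choice of the sample mean as summary statistic are all assumptions made for illustration.

```python
import random
import statistics

def simulate(theta, n, rng):
    """Toy simulator (an assumption): n draws from Normal(theta, 1)."""
    return [rng.gauss(theta, 1.0) for _ in range(n)]

def summary(data):
    """Summary statistic reducing the data to one number (the sample mean)."""
    return statistics.fmean(data)

def abc_rejection(observed, n_sims=5000, tol=0.2, seed=0):
    """Keep prior draws whose simulated summary lies within tol of the observed one."""
    rng = random.Random(seed)
    s_obs = summary(observed)
    accepted = []
    for _ in range(n_sims):
        theta = rng.uniform(-5.0, 5.0)  # flat prior on [-5, 5], an assumption
        s_sim = summary(simulate(theta, len(observed), rng))
        if abs(s_sim - s_obs) < tol:
            accepted.append(theta)
    return accepted

data_rng = random.Random(1)
observed = simulate(2.0, 50, data_rng)  # synthetic "observation" with theta = 2
posterior_sample = abc_rejection(observed)
print(len(posterior_sample), statistics.fmean(posterior_sample))
```

The accepted values form a sample from an approximate posterior whose quality depends on the summary statistic, which is the dependence the paper addresses by learning the statistics.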
  • 27
    Publication Date: 2022
    Description: We introduce a procedure for conditional density estimation under logarithmic loss, which we call SMP (Sample Minmax Predictor). This estimator minimizes a new general excess risk bound for statistical learning. On standard examples, this bound scales as $d/n$ with $d$ the model dimension and $n$ the sample size, and critically remains valid under model misspecification. Being an improper (out-of-model) procedure, SMP improves over within-model estimators such as the maximum likelihood estimator, whose excess risk degrades under misspecification. Compared to approaches reducing to the sequential problem, our bounds remove suboptimal $\log n$ factors and can handle unbounded classes. For the Gaussian linear model, the predictions and risk bound of SMP are governed by leverage scores of covariates, nearly matching the optimal risk in the well-specified case without conditions on the noise variance or approximation error of the linear model. For logistic regression, SMP provides a non-Bayesian approach to calibration of probabilistic predictions relying on virtual samples, and can be computed by solving two logistic regressions. It achieves a non-asymptotic excess risk of $O((d + B^2R^2)/n)$, where $R$ bounds the norm of features and $B$ that of the comparison parameter; by contrast, no within-model estimator can achieve better rate than $\min({B R}/{\sqrt{n}}, {d e^{BR}}/{n} )$ in general. This provides a more practical alternative to Bayesian approaches, which require approximate posterior sampling, thereby partly addressing a question raised by Foster et al. (2018).
    Print ISSN: 1532-4435
    Electronic ISSN: 1533-7928
    Topics: Computer Science
  • 28
    Publication Date: 2022
    Description: Graph representation learning has many real-world applications, from self-driving LiDAR, 3D computer vision to drug repurposing, protein classification, social networks analysis. An adequate representation of graph data is vital to the learning performance of a statistical or machine learning model for graph-structured data. This paper proposes a novel multiscale representation system for graph data, called decimated framelets, which form a localized tight frame on the graph. The decimated framelet system allows storage of the graph data representation on a coarse-grained chain and processes the graph data at multi scales where at each scale, the data is stored on a subgraph. Based on this, we establish decimated G-framelet transforms for the decomposition and reconstruction of the graph data at multi resolutions via a constructive data-driven filter bank. The graph framelets are built on a chain-based orthonormal basis that supports fast graph Fourier transforms. From this, we give a fast algorithm for the decimated G-framelet transforms, or FGT, that has linear computational complexity O(N) for a graph of size N. The effectiveness for constructing the decimated framelet system and the FGT is demonstrated by a simulated example of random graphs and real-world applications, including multiresolution analysis for traffic network and representation learning of graph neural networks for graph classification tasks.
    Print ISSN: 1532-4435
    Electronic ISSN: 1533-7928
    Topics: Computer Science
  • 29
    Publication Date: 2022
    Description: DoubleML is an open-source Python library implementing the double machine learning framework of Chernozhukov et al. (2018) for a variety of causal models. It contains functionalities for valid statistical inference on causal parameters when the estimation of nuisance parameters is based on machine learning methods. The object-oriented implementation of DoubleML provides a high flexibility in terms of model specifications and makes it easily extendable. The package is distributed under the MIT license and relies on core libraries from the scientific Python ecosystem: scikit-learn, numpy, pandas, scipy, statsmodels and joblib. Source code, documentation and an extensive user guide can be found at https://github.com/DoubleML/doubleml-for-py and https://docs.doubleml.org.
    Print ISSN: 1532-4435
    Electronic ISSN: 1533-7928
    Topics: Computer Science
  • 30
    Publication Date: 2022
    Description: Conditional density estimation is a fundamental problem in statistics, with scientific and practical applications in biology, economics, finance and environmental studies, to name a few. In this paper, we propose a conditional density estimator based on gradient boosting and Lindsey's method (LinCDE). LinCDE admits flexible modeling of the density family and can capture distributional characteristics like modality and shape. In particular, when suitably parametrized, LinCDE will produce smooth and non-negative density estimates. Furthermore, like boosted regression trees, LinCDE does automatic feature selection. We demonstrate LinCDE's efficacy through extensive simulations and three real data examples.
    Print ISSN: 1532-4435
    Electronic ISSN: 1533-7928
    Topics: Computer Science
  • 31
    Publication Date: 2022
    Description: In this paper, we study the concentration property of stochastic gradient descent (SGD) solutions. In existing concentration analyses, researchers impose restrictive requirements on the gradient noise, such as boundedness or sub-Gaussianity. We consider a much richer class of noise where only finitely-many moments are required, thus allowing heavy-tailed noises. In particular, we obtain Nagaev type high-probability upper bounds for the estimation errors of averaged stochastic gradient descent (ASGD) in a linear model. Specifically, we prove that, after $T$ steps of SGD, the ASGD estimate achieves an $O(\sqrt{\log(1/\delta)/T} + (\delta T^{q-1})^{-1/q})$ error rate with probability at least $1-\delta$, where $q>2$ controls the tail of the gradient noise. In comparison, one has the $O(\sqrt{\log(1/\delta)/T})$ error rate for sub-Gaussian noises. We also show that the Nagaev type upper bound is almost tight through an example, where the exact asymptotic form of the tail probability can be derived. Our concentration analysis indicates that, in the case of heavy-tailed noises, the polynomial dependence on the failure probability $\delta$ is generally unavoidable for the error rate of SGD.
    Print ISSN: 1532-4435
    Electronic ISSN: 1533-7928
    Topics: Computer Science
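The setting of the abstract above, averaged SGD on a linear model with heavy-tailed noise, can be simulated in a few lines. This is a minimal sketch under assumptions of my own: Student-t noise with 3 degrees of freedom (so only moments of order below 3 are finite), standard Gaussian covariates, and a step size proportional to $1/\sqrt{t}$. None of these choices is taken from the paper.

```python
import math
import random

def student_t(rng, df):
    """Draw one Student-t variate with df degrees of freedom (heavy-tailed for small df)."""
    z = rng.gauss(0.0, 1.0)
    chi2 = rng.gammavariate(df / 2.0, 2.0)  # chi-square with df degrees of freedom
    return z / math.sqrt(chi2 / df)

def averaged_sgd(theta_star, steps, df=3.0, eta0=0.05, seed=0):
    """Run SGD on y = <x, theta_star> + noise and return the average of the iterates."""
    rng = random.Random(seed)
    d = len(theta_star)
    theta = [0.0] * d      # current SGD iterate
    theta_bar = [0.0] * d  # running average of the iterates (the ASGD estimate)
    for t in range(1, steps + 1):
        x = [rng.gauss(0.0, 1.0) for _ in range(d)]
        y = sum(xi * si for xi, si in zip(x, theta_star)) + student_t(rng, df)
        residual = sum(xi * ti for xi, ti in zip(x, theta)) - y
        eta = eta0 / math.sqrt(t)  # decaying step size, an assumption
        theta = [ti - eta * residual * xi for ti, xi in zip(theta, x)]
        theta_bar = [b + (ti - b) / t for b, ti in zip(theta_bar, theta)]
    return theta_bar

theta_star = [1.0, -2.0, 0.5]
estimate = averaged_sgd(theta_star, steps=20000)
error = math.sqrt(sum((e - s) ** 2 for e, s in zip(estimate, theta_star)))
print(estimate, error)
```

Repeating the run over many seeds and inspecting the upper quantiles of `error` is one way to observe the tail behavior that the stated bound describes.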
  • 32
    Publication Date: 2022
    Description: In increasingly many settings, data sets consist of multiple samples from a population of networks, with vertices aligned across networks; for example, brain connectivity networks in neuroscience. We consider the setting where the observed networks have a shared expectation, but may differ in the noise structure on their edges. Our approach exploits the shared mean structure to denoise edge-level measurements of the observed networks and estimate the underlying population-level parameters. We also explore the extent to which edge-level errors influence estimation and downstream inference. In the process, we establish a finite-sample concentration inequality for the low-rank eigenvalue truncation of a random weighted adjacency matrix, which may be of independent interest. The proposed approach is illustrated on synthetic networks and on data from an fMRI study of schizophrenia.
    Print ISSN: 1532-4435
    Electronic ISSN: 1533-7928
    Topics: Computer Science
  • 33
    Publication Date: 2022
    Description: Performing exact Bayesian inference for complex models is computationally intractable. Markov chain Monte Carlo (MCMC) algorithms can provide reliable approximations of the posterior distribution but are expensive for large data sets and high-dimensional models. A standard approach to mitigate this complexity consists in using subsampling techniques or distributing the data across a cluster. However, these approaches are typically unreliable in high-dimensional scenarios. We focus here on a recent alternative class of MCMC schemes exploiting a splitting strategy akin to the one used by the celebrated alternating direction method of multipliers (ADMM) optimization algorithm. These methods appear to provide empirically state-of-the-art performance but their theoretical behavior in high dimension is currently unknown. In this paper, we propose a detailed theoretical study of one of these algorithms known as the split Gibbs sampler. Under regularity conditions, we establish explicit convergence rates for this scheme using Ricci curvature and coupling ideas. We support our theory with numerical illustrations.
    Print ISSN: 1532-4435
    Electronic ISSN: 1533-7928
    Topics: Computer Science
  • 34
    Publication Date: 2022
    Description: We apply methods from randomized numerical linear algebra (RandNLA) to develop improved algorithms for the analysis of large-scale time series data. We first develop a new fast algorithm to estimate the leverage scores of an autoregressive (AR) model in big data regimes. We show that the accuracy of approximations lies within $(1+\mathcal{O}({\varepsilon}))$ of the true leverage scores with high probability. These theoretical results are subsequently exploited to develop an efficient algorithm, called LSAR, for fitting an appropriate AR model to big time series data. Our proposed algorithm is guaranteed, with high probability, to find the maximum likelihood estimates of the parameters of the underlying true AR model and has a worst case running time that significantly improves those of the state-of-the-art alternatives in big data regimes. Empirical results on large-scale synthetic as well as real data highly support the theoretical results and reveal the efficacy of this new approach.
    Print ISSN: 1532-4435
    Electronic ISSN: 1533-7928
    Topics: Computer Science
  • 35
    Publication Date: 2022
    Description: In the paper, we propose a class of accelerated zeroth-order and first-order momentum methods for both nonconvex mini-optimization and minimax-optimization. Specifically, we propose a new accelerated zeroth-order momentum (Acc-ZOM) method for black-box mini-optimization where only function values can be obtained. Moreover, we prove that our Acc-ZOM method achieves a lower query complexity of $\tilde{O}(d^{3/4}\epsilon^{-3})$ for finding an $\epsilon$-stationary point, which improves the best known result by a factor of $O(d^{1/4})$ where $d$ denotes the variable dimension. In particular, our Acc-ZOM does not need large batches required in the existing zeroth-order stochastic algorithms. Meanwhile, we propose an accelerated zeroth-order momentum descent ascent (Acc-ZOMDA) method for black-box minimax optimization, where only function values can be obtained. Our Acc-ZOMDA obtains a low query complexity of $\tilde{O}((d_1+d_2)^{3/4}\kappa_y^{4.5}\epsilon^{-3})$ without requiring large batches for finding an $\epsilon$-stationary point, where $d_1$ and $d_2$ denote variable dimensions and $\kappa_y$ is condition number. Moreover, we propose an accelerated first-order momentum descent ascent (Acc-MDA) method for minimax optimization, whose explicit gradients are accessible. Our Acc-MDA achieves a low gradient complexity of $\tilde{O}(\kappa_y^{4.5}\epsilon^{-3})$ without requiring large batches for finding an $\epsilon$-stationary point. In particular, our Acc-MDA can obtain a lower gradient complexity of $\tilde{O}(\kappa_y^{2.5}\epsilon^{-3})$ with a batch size $O(\kappa_y^4)$, which improves the best known result by a factor of $O(\kappa_y^{1/2})$. Extensive experimental results on black-box adversarial attack to deep neural networks and poisoning attack to logistic regression demonstrate efficiency of our algorithms.
    Print ISSN: 1532-4435
    Electronic ISSN: 1533-7928
    Topics: Computer Science
  • 36
    Publication Date: 2022
    Description: In this paper, we study two challenging problems in explainable AI (XAI) and data clustering. The first is how to directly design a neural network with inherent interpretability, rather than giving post-hoc explanations of a black-box model. The second is implementing discrete $k$-means with a differentiable neural network that embraces the advantages of parallel computing, online clustering, and clustering-favorable representation learning. To address these two challenges, we design a novel neural network, which is a differentiable reformulation of the vanilla $k$-means, called inTerpretable nEuraL cLustering (TELL). Our contributions are threefold. First, to the best of our knowledge, most existing XAI works focus on supervised learning paradigms. This work is one of the few XAI studies on unsupervised learning, in particular, data clustering. Second, TELL is an interpretable, or so-called intrinsically explainable and transparent, model. In contrast, most existing XAI studies resort to various means for understanding a black-box model with post-hoc explanations. Third, from the view of data clustering, TELL possesses many properties highly desired by $k$-means, including but not limited to online clustering, plug-and-play module, parallel computing, and provable convergence. Extensive experiments show that our method achieves superior performance compared with 14 clustering approaches on three challenging data sets. The source code can be accessed at www.pengxi.me.
    Print ISSN: 1532-4435
    Electronic ISSN: 1533-7928
    Topics: Computer Science
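The abstract above does not spell out how TELL makes $k$-means differentiable. As a generic illustration only, the sketch below contrasts the hard nearest-center assignment of vanilla $k$-means with a common relaxation, a softmax over negative squared distances. The function names and the `temperature` parameter are assumptions and should not be read as the authors' formulation.

```python
import math

def squared_distance(point, center):
    return sum((p - c) ** 2 for p, c in zip(point, center))

def hard_assign(point, centers):
    """Vanilla k-means assignment: index of the nearest center (not differentiable)."""
    dists = [squared_distance(point, c) for c in centers]
    return dists.index(min(dists))

def soft_assign(point, centers, temperature=1.0):
    """Softmax over negative squared distances: a smooth stand-in for the argmin."""
    logits = [-squared_distance(point, c) / temperature for c in centers]
    top = max(logits)  # subtract the maximum for numerical stability
    weights = [math.exp(l - top) for l in logits]
    total = sum(weights)
    return [w / total for w in weights]

centers = [[0.0, 0.0], [4.0, 0.0]]
point = [1.0, 0.0]
print(hard_assign(point, centers))       # index of the nearest center
print(soft_assign(point, centers, 1.0))  # membership weights that sum to 1
```

As the temperature approaches zero, the soft weights concentrate on the nearest center, so the relaxation recovers the hard assignment in the limit.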
  • 37
    Publication Date: 2022
    Description: We consider a problem of manifold estimation from noisy observations. Many manifold learning procedures locally approximate a manifold by a weighted average over a small neighborhood. However, in the presence of large noise, the assigned weights become so corrupted that the averaged estimate shows very poor performance. We suggest a structure-adaptive procedure, which simultaneously reconstructs a smooth manifold and estimates projections of the point cloud onto this manifold. The proposed approach iteratively refines the weights on each step, using the structural information obtained at previous steps. After several iterations, we obtain nearly “oracle” weights, so that the final estimates are nearly efficient even in the presence of relatively large noise. In our theoretical study, we establish tight lower and upper bounds proving asymptotic optimality of the method for manifold estimation under the Hausdorff loss, provided that the noise degrades to zero fast enough.
    Print ISSN: 1532-4435
    Electronic ISSN: 1533-7928
    Topics: Computer Science
  • 38
    Publication Date: 2022
    Description: Although various distributed machine learning schemes have been proposed recently for purely linear models and fully nonparametric models, little attention has been paid to distributed optimization for semi-parametric models with multiple structures (e.g. sparsity, linearity and nonlinearity). To address these issues, the current paper proposes a new communication-efficient distributed learning algorithm for sparse partially linear models with an increasing number of features. The proposed method is based on the classical divide and conquer strategy for handling big data and the computation on each subsample consists of a debiased estimation of the doubly regularized least squares approach. With the proposed method, we theoretically prove that our global parametric estimator can achieve the optimal parametric rate in our semi-parametric model given an appropriate partition on the total data. Specifically, the choice of data partition relies on the underlying smoothness of the nonparametric component, and it is adaptive to the sparsity parameter. Finally, some simulated experiments are carried out to illustrate the empirical performances of our debiased technique under the distributed setting.
    Print ISSN: 1532-4435
    Electronic ISSN: 1533-7928
    Topics: Computer Science
  • 39
    Publication Date: 2022
    Description: Real-world applications of machine learning tools in high-stakes domains are often regulated to be fair, in the sense that the predicted target should satisfy some quantitative notion of parity with respect to a protected attribute. However, the exact tradeoff between fairness and accuracy is not entirely clear, even for the basic paradigm of classification problems. In this paper, we characterize an inherent tradeoff between statistical parity and accuracy in the classification setting by providing a lower bound on the sum of group-wise errors of any fair classifiers. Our impossibility theorem could be interpreted as a certain uncertainty principle in fairness: if the base rates differ among groups, then any fair classifier satisfying statistical parity has to incur a large error on at least one of the groups. We further extend this result to give a lower bound on the joint error of any (approximately) fair classifiers, from the perspective of learning fair representations. To show that our lower bound is tight, assuming oracle access to Bayes (potentially unfair) classifiers, we also construct an algorithm that returns a randomized classifier which is both optimal (in terms of accuracy) and fair. Interestingly, when the protected attribute can take more than two values, an extension of this lower bound does not admit an analytic solution. Nevertheless, in this case, we show that the lower bound can be efficiently computed by solving a linear program, which we term as the TV-Barycenter problem, a barycenter problem under the TV-distance. On the upside, we prove that if the group-wise Bayes optimal classifiers are close, then learning fair representations leads to an alternative notion of fairness, known as the accuracy parity, which states that the error rates are close between groups. Finally, we also conduct experiments on real-world datasets to confirm our theoretical findings.
    Print ISSN: 1532-4435
    Electronic ISSN: 1533-7928
    Topics: Computer Science
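The quantities named in the abstract above (base rates, statistical parity and group-wise errors) are easy to compute on a small example. The sketch below uses made-up toy labels, not data from the paper, to show a predictor that satisfies statistical parity while the base rates of the two groups differ.

```python
def rate(values):
    """Mean of a list of 0/1 values."""
    return sum(values) / len(values)

def group_metrics(y_true, y_pred, group):
    """Per-group base rate, positive-prediction rate and error rate for binary labels."""
    out = {}
    for g in sorted(set(group)):
        idx = [i for i, gi in enumerate(group) if gi == g]
        out[g] = {
            "base_rate": rate([y_true[i] for i in idx]),
            "positive_rate": rate([y_pred[i] for i in idx]),
            "error": rate([int(y_true[i] != y_pred[i]) for i in idx]),
        }
    return out

def statistical_parity_gap(metrics):
    """Largest difference in positive-prediction rate between groups (0 means parity)."""
    rates = [m["positive_rate"] for m in metrics.values()]
    return max(rates) - min(rates)

# Toy data (hypothetical): group 0 has base rate 0.75, group 1 has base rate 0.25.
group  = [0, 0, 0, 0, 1, 1, 1, 1]
y_true = [1, 1, 1, 0, 1, 0, 0, 0]
y_pred = [1, 1, 0, 0, 1, 1, 0, 0]  # predicts positive for half of each group

metrics = group_metrics(y_true, y_pred, group)
print(metrics)
print("parity gap:", statistical_parity_gap(metrics))
```

Here the predictor meets statistical parity exactly, yet it misclassifies a quarter of each group, which illustrates the kind of tension between parity and accuracy that the paper quantifies.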
  • 40
    Publication Date: 2022
    Description: We study the optimal transport problem for pairs of stationary finite-state Markov chains, with an emphasis on the computation of optimal transition couplings. Transition couplings are a constrained family of transport plans that capture the dynamics of Markov chains. Solutions of the optimal transition coupling (OTC) problem correspond to alignments of the two chains that minimize long-term average cost. We establish a connection between the OTC problem and Markov decision processes, and show that solutions of the OTC problem can be obtained via an adaptation of policy iteration. For settings with large state spaces, we develop a fast approximate algorithm based on an entropy-regularized version of the OTC problem, and provide bounds on its per-iteration complexity. We establish a stability result for both the regularized and unregularized algorithms, from which a statistical consistency result follows as a corollary. We validate our theoretical results empirically through a simulation study, demonstrating that the approximate algorithm exhibits faster overall runtime with low error. Finally, we extend the setting and application of our methods to hidden Markov models, and illustrate the potential use of the proposed algorithms in practice with an application to computer-generated music.
    Print ISSN: 1532-4435
    Electronic ISSN: 1533-7928
    Topics: Computer Science
  • 41
    Publication Date: 2022
    Description: In high dimension, low sample size (HDLSS) settings, classifiers based on Euclidean distances like the nearest neighbor classifier and the average distance classifier perform quite poorly if differences between locations of the underlying populations get masked by scale differences. To rectify this problem, several modifications of these classifiers have been proposed in the literature. However, existing methods are confined to location and scale differences only, and they often fail to discriminate among populations differing outside of the first two moments. In this article, we propose some simple transformations of these classifiers resulting in improved performance even when the underlying populations have the same location and scale. We further propose a generalization of these classifiers based on the idea of grouping of variables. High-dimensional behavior of the proposed classifiers is studied theoretically. Numerical experiments with a variety of simulated examples as well as an extensive analysis of benchmark data sets from three different databases exhibit advantages of the proposed methods.
    Print ISSN: 1532-4435
    Electronic ISSN: 1533-7928
    Topics: Computer Science
  • 42
    Publication Date: 2022
    Description: Deep learning uses neural networks which are parameterised by their weights. The neural networks are usually trained by tuning the weights to directly minimise a given loss function. In this paper we propose to re-parameterise the weights into targets for the firing strengths of the individual nodes in the network. Given a set of targets, it is possible to calculate the weights which make the firing strengths best meet those targets. It is argued that using targets for training addresses the problem of exploding gradients, by a process which we call cascade untangling, and makes the loss-function surface smoother to traverse, and so leads to easier, faster training, and also potentially better generalisation, of the neural network. It also allows for easier learning of deeper and recurrent network structures. The necessary conversion of targets to weights comes at an extra computational expense, which is in many cases manageable. Learning in target space can be combined with existing neural-network optimisers, for extra gain. Experimental results show the speed of using target space, and examples of improved generalisation, for fully-connected networks and convolutional networks, and the ability to recall and process long time sequences and perform natural-language processing with recurrent networks.
    Print ISSN: 1532-4435
    Electronic ISSN: 1533-7928
    Topics: Computer Science
  • 43
    Publication Date: 2022
    Description: With few exceptions, neural networks have been relying on backpropagation and gradient descent as the inference engine in order to learn the model parameters, because closed-form Bayesian inference for neural networks has been considered to be intractable. In this paper, we show how we can leverage the capabilities of tractable approximate Gaussian inference (TAGI) to infer hidden states, rather than only using it for inferring the network's parameters. One novel aspect is that it allows inferring hidden states through the imposition of constraints designed to achieve specific objectives, as illustrated through three examples: (1) the generation of adversarial-attack examples, (2) the usage of a neural network as a black-box optimization method, and (3) the application of inference on continuous-action reinforcement learning. In these three examples, the constraints are, in (1), a target label chosen to fool a neural network and, in (2) and (3), the derivative of the network with respect to its input, which is set to zero in order to infer the optimal input values that either maximize or minimize it. These applications showcase how tasks that were previously reserved to gradient-based optimization approaches can now be approached with analytically tractable inference.
    Print ISSN: 1532-4435
    Electronic ISSN: 1533-7928
    Topics: Computer Science
  • 44
    Publication Date: 2022
    Description: In this article, we delve into the class of so-called ill-posed Linear Inverse Problems (LIP), which simply refer to the task of recovering the entire signal from its relatively few random linear measurements. Such problems arise in a variety of settings, with applications ranging from medical image processing to recommender systems. We propose a slightly generalized version of the error constrained linear inverse problem and obtain a novel and equivalent convex-concave min-max reformulation by providing an exposition of its convex geometry. Saddle points of the min-max problem are completely characterized in terms of a solution to the LIP, and vice versa. Applying simple saddle point seeking ascent-descent type algorithms to solve the min-max problems provides novel and simple algorithms to find a solution to the LIP. Moreover, the reformulation of an LIP as the min-max problem provided in this article is crucial in developing methods to solve the dictionary learning problem with almost sure recovery constraints.
    Print ISSN: 1532-4435
    Electronic ISSN: 1533-7928
    Topics: Computer Science
  • 45
    Publication Date: 2022
    Description: We consider the classic supervised learning problem where a continuous non-negative random label $Y$ (e.g. a random duration) is to be predicted based upon observing a random vector $X$ valued in $\mathbb{R}^d$ with $d\geq 1$ by means of a regression rule with minimum least square error. In various applications, ranging from industrial quality control to public health through credit risk analysis for instance, training observations can be right censored, meaning that, rather than on independent copies of $(X,Y)$, statistical learning relies on a collection of $n\geq 1$ independent realizations of the triplet $(X, \; \min\{Y,\; C\},\; \delta)$, where $C$ is a nonnegative random variable with unknown distribution, modelling censoring and $\delta=\mathbb{I}\{Y\leq C\}$ indicates whether the duration is right censored or not. As ignoring censoring in the risk computation may clearly lead to a severe underestimation of the target duration and jeopardize prediction, we consider a plug-in estimate of the true risk based on a Kaplan-Meier estimator of the conditional survival function of the censoring $C$ given $X$, referred to as Beran risk, in order to perform empirical risk minimization. It is established, under mild conditions, that the learning rate of minimizers of this biased/weighted empirical risk functional is of order $O_{\mathbb{P}}(\sqrt{\log(n)/n})$ when ignoring model bias issues inherent to plug-in estimation, as can be attained in absence of censoring. Beyond theoretical results, numerical experiments are presented in order to illustrate the relevance of the approach developed.
    Print ISSN: 1532-4435
    Electronic ISSN: 1533-7928
    Topics: Computer Science
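    Illustration (not taken from the article): a minimal simulation of the right-censored observation scheme $(X, \min\{Y, C\}, \delta)$ described in the abstract, showing why treating the observed durations as if they were uncensored underestimates the target. The exponential distributions, their means, and the function name are assumptions chosen for the sketch. The Kaplan-Meier-based Beran risk itself is not implemented here.

```python
import random


def simulate_censored(n, seed=0):
    """Draw n tuples (x, y, observed, delta) under an assumed censoring model."""
    rng = random.Random(seed)
    sample = []
    for _ in range(n):
        x = rng.random()                        # covariate X in [0, 1)
        y = rng.expovariate(1.0 / (1.0 + x))    # true duration Y, mean 1 + x
        c = rng.expovariate(1.0 / 2.0)          # censoring time C, mean 2
        observed = min(y, c)                    # what the learner actually sees
        delta = 1 if y <= c else 0              # 1 means Y was observed exactly
        sample.append((x, y, observed, delta))
    return sample


sample = simulate_censored(2000)
true_mean = sum(y for _, y, _, _ in sample) / len(sample)
naive_mean = sum(obs for _, _, obs, _ in sample) / len(sample)
censored_share = sum(1 - d for _, _, _, d in sample) / len(sample)

print(f"mean of true durations      : {true_mean:.3f}")
print(f"mean of observed durations  : {naive_mean:.3f}")
print(f"share of censored durations : {censored_share:.3f}")
```

    Because min{Y, C} never exceeds Y, the naive average can only fall below the true one. A censoring-aware risk has to reweight the uncensored observations to correct for this.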
  • 46
    Publication Date: 2022
    Description: We perform a systematic study of the approximation properties and optimization dynamics of recurrent neural networks (RNNs) when applied to learn input-output relationships in temporal data. We consider the simple but representative setting of using continuous-time linear RNNs to learn from data generated by linear relationships. On the approximation side, we prove a direct and an inverse approximation theorem of linear functionals using RNNs, which reveal the intricate connections between memory structures in the target and the corresponding approximation efficiency. In particular, we show that temporal relationships can be effectively approximated by RNNs if and only if the former possesses sufficient memory decay. On the optimization front, we perform detailed analysis of the optimization dynamics, including a precise understanding of the difficulty that may arise in learning relationships with long-term memory. The term “curse of memory” is coined to describe the uncovered phenomena, akin to the “curse of dimension” that plagues high-dimensional function approximation. These results form a relatively complete picture of the interaction of memory and recurrent structures in the linear dynamical setting.
    Print ISSN: 1532-4435
    Electronic ISSN: 1533-7928
    Topics: Computer Science
  • 47
    Publication Date: 2022
    Description: We prove two universal approximation theorems for a range of dropout neural networks. These are feed-forward neural networks in which each edge is given a random $\{0,1\}$-valued filter, that have two modes of operation: in the first each edge output is multiplied by its random filter, resulting in a random output, while in the second each edge output is multiplied by the expectation of its filter, leading to a deterministic output. It is common to use the random mode during training and the deterministic mode during testing and prediction. Both theorems are of the following form: Given a function to approximate and a threshold $\varepsilon>0$, there exists a dropout network that is $\varepsilon$-close in probability and in $L^q$. The first theorem applies to dropout networks in the random mode. It assumes little on the activation function, applies to a wide class of networks, and can even be applied to approximation schemes other than neural networks. The core is an algebraic property that shows that deterministic networks can be exactly matched in expectation by random networks. The second theorem makes stronger assumptions and gives a stronger result. Given a function to approximate, it provides existence of a network that approximates in both modes simultaneously. Proof components are a recursive replacement of edges by independent copies, and a special first-layer replacement that couples the resulting larger network to the input. The functions to be approximated are assumed to be elements of general normed spaces, and the approximations are measured in the corresponding norms. The networks are constructed explicitly. Because of the different methods of proof, the two results give independent insight into the approximation properties of random dropout networks. With this, we establish that dropout neural networks broadly satisfy a universal-approximation property.
    Print ISSN: 1532-4435
    Electronic ISSN: 1533-7928
    Topics: Computer Science
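    Illustration (not taken from the article): the two modes of operation described in the abstract, sketched for a single linear unit with independent Bernoulli filters on its incoming edges. In this simple linear case the random-mode output, averaged exactly over all filter patterns, equals the deterministic-mode output. The article's algebraic property for general networks is a stronger statement and is not reproduced here. All names and the keep probability are assumptions.

```python
from itertools import product


def edge_outputs(x, w):
    """One output per incoming edge of a single linear unit."""
    return [xi * wi for xi, wi in zip(x, w)]


def random_mode(x, w, mask):
    """Each edge output is multiplied by its {0,1}-valued filter."""
    return sum(m * e for m, e in zip(mask, edge_outputs(x, w)))


def deterministic_mode(x, w, keep_prob):
    """Each edge output is multiplied by the expectation of its filter."""
    return sum(keep_prob * e for e in edge_outputs(x, w))


def expected_random_output(x, w, keep_prob):
    """Exact expectation of the random mode, enumerating every filter pattern."""
    total = 0.0
    for mask in product((0, 1), repeat=len(x)):
        prob = 1.0
        for m in mask:
            prob *= keep_prob if m == 1 else 1.0 - keep_prob
        total += prob * random_mode(x, w, mask)
    return total


x = [1.0, -2.0, 0.5]      # assumed inputs
w = [0.3, 0.8, -1.2]      # assumed edge weights
keep_prob = 0.75          # assumed filter expectation

print(expected_random_output(x, w, keep_prob))
print(deterministic_mode(x, w, keep_prob))
```

    With a nonlinear activation after the sum, the two quantities generally differ, which is why matching in expectation for general networks requires a dedicated construction.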
  • 48
    Publication Date: 2022
    Description: An innovations sequence of a time series is a sequence of independent and identically distributed random variables with which the original time series has a causal representation. The innovation at a time is statistically independent of the history of the time series. As such, it represents the new information contained at present but not in the past. Because of its simple probability structure, the innovations sequence is the most efficient signature of the original. Unlike the principal or independent component representations, an innovations sequence preserves not only the complete statistical properties but also the temporal order of the original time series. A long-standing open problem is to find a computationally tractable way to extract an innovations sequence of non-Gaussian processes. This paper presents a deep learning approach, referred to as Innovations Autoencoder (IAE), that extracts innovations sequences using a causal convolutional neural network. An application of IAE to the one-class anomalous sequence detection problem with unknown anomaly and anomaly-free models is also presented.
    Print ISSN: 1532-4435
    Electronic ISSN: 1533-7928
    Topics: Computer Science
  • 49
    Publication Date: 2022
    Description: We present a uniform analysis of biased stochastic gradient methods for minimizing convex, strongly convex, and non-convex composite objectives, and identify settings where bias is useful in stochastic gradient estimation. The framework we present allows us to extend proximal support to biased algorithms, including SAG and SARAH, for the first time in the convex setting. We also use our framework to develop a new algorithm, Stochastic Average Recursive GradiEnt (SARGE), that achieves the oracle complexity lower-bound for non-convex, finite-sum objectives and requires strictly fewer calls to a stochastic gradient oracle per iteration than SVRG and SARAH. We support our theoretical results with numerical experiments that demonstrate the benefits of certain biased gradient estimators.
    Print ISSN: 1532-4435
    Electronic ISSN: 1533-7928
    Topics: Computer Science
  • 50
    Publication Date: 2022
    Description: Open category detection is the problem of detecting “alien” test instances that belong to categories or classes that were not present in the training data. In many applications, reliably detecting such aliens is central to ensuring the safety and accuracy of test set predictions. Unfortunately, there are no algorithms that provide theoretical guarantees on their ability to detect aliens under general assumptions. Further, while there are algorithms for open category detection, there are few empirical results that directly report alien detection rates. Thus, there are significant theoretical and empirical gaps in our understanding of open category detection. In this paper, we take a step toward addressing this gap by studying a simple, but practically-relevant variant of open category detection. In our setting, we are provided with a “clean” training set that contains only the target categories of interest and an unlabeled “contaminated” training set that contains a fraction $\alpha$ of alien examples. Under the assumption that we know an upper bound on $\alpha$, we develop an algorithm that gives PAC-style guarantees on the alien detection rate, while aiming to minimize false alarms. Given an overall budget on the amount of training data, we also derive the optimal allocation of samples between the mixture and the clean data sets. Experiments on synthetic and standard benchmark datasets evaluate the regimes in which the algorithm can be effective and provide a baseline for further advancements. In addition, for the situation when an upper bound for $\alpha$ is not available, we employ nine different anomaly proportion estimators, and run experiments on both synthetic and standard benchmark data sets to compare their performance.
    Print ISSN: 1532-4435
    Electronic ISSN: 1533-7928
    Topics: Computer Science
  • 51
    Publication Date: 2022
    Description: This paper presents solo-learn, a library of self-supervised methods for visual representation learning. Implemented in Python, using PyTorch and PyTorch Lightning, the library fits both research and industry needs by featuring distributed training pipelines with mixed-precision, faster data loading via Nvidia DALI, online linear evaluation for better prototyping, and many additional training tricks. Our goal is to provide an easy-to-use library comprising a large number of Self-supervised Learning (SSL) methods that can be easily extended and fine-tuned by the community. solo-learn opens up avenues for exploiting large-budget SSL solutions on inexpensive smaller infrastructures and seeks to democratize SSL by making it accessible to all. The source code is available at https://github.com/vturrisi/solo-learn.
    Print ISSN: 1532-4435
    Electronic ISSN: 1533-7928
    Topics: Computer Science
  • 52
    Publication Date: 2022
    Description: We present a novel class of projected methods to perform statistical analysis on a data set of probability distributions on the real line, with the 2-Wasserstein metric. We focus in particular on Principal Component Analysis (PCA) and regression. To define these models, we exploit a representation of the Wasserstein space closely related to its weak Riemannian structure by mapping the data to a suitable linear space and using a metric projection operator to constrain the results in the Wasserstein space. By carefully choosing the tangent point, we are able to derive fast empirical methods, exploiting a constrained B-spline approximation. As a byproduct of our approach, we are also able to derive faster routines for previous work on PCA for distributions. By means of simulation studies, we compare our approaches to previously proposed methods, showing that our projected PCA has similar performance for a fraction of the computational cost and that the projected regression is extremely flexible even under misspecification. Several theoretical properties of the models are investigated, and asymptotic consistency is proven. Two real world applications to Covid-19 mortality in the US and wind speed forecasting are discussed.
    Print ISSN: 1532-4435
    Electronic ISSN: 1533-7928
    Topics: Computer Science
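    Illustration (not taken from the article): the 2-Wasserstein metric between distributions on the real line has a simple form in terms of quantile functions, which is what makes this setting computationally convenient. The sketch below assumes two empirical distributions with the same number of equally weighted samples, in which case the optimal coupling pairs the sorted samples. The function name is hypothetical, and the article's projected PCA and regression are not implemented.

```python
import math


def wasserstein2_1d(xs, ys):
    """2-Wasserstein distance between two equal-size empirical distributions on R."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("both samples must be non-empty and of equal size")
    a, b = sorted(xs), sorted(ys)   # empirical quantile functions
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)) / len(a))


xs = [0.2, 1.5, -0.7, 3.0]
shifted = [5.0, 2.2, 3.5, 1.3]      # the same values shifted by 2, in another order

print(wasserstein2_1d(xs, xs))       # identical samples
print(wasserstein2_1d(xs, shifted))  # a pure shift moves every quantile by the same amount
```

    A pure translation by c gives a distance of |c|, regardless of the order in which the samples are listed.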
  • 53
    Publication Date: 2022
    Description: Many current applications in data science need rich model classes to adequately represent the statistics that may be driving the observations. Such rich model classes may be too complex to admit uniformly consistent estimators. In such cases, it is conventional to settle for estimators with guarantees on convergence rate where the performance can be bounded in a model-dependent way, i.e. pointwise consistent estimators. But this viewpoint has the practical drawback that estimator performance is a function of the unknown model within the model class that is being estimated. Even if an estimator is consistent, how well it is doing at any given time may not be clear, no matter what the sample size of the observations. In these cases, a line of analysis favors sample dependent guarantees. We explore this framework by studying rich model classes that may only admit pointwise consistency guarantees, yet enough information about the unknown model driving the observations needed to gauge estimator accuracy can be inferred from the sample at hand. In this paper we obtain a novel characterization of lossless compression problems over a countable alphabet in the data-derived framework in terms of what we term deceptive distributions. We also show that the ability to estimate the redundancy of compressing memoryless sources is equivalent to learning the underlying single-letter marginal in a data-derived fashion. We expect that the methodology underlying such characterizations in a data-derived estimation framework will be broadly applicable to a wide range of estimation problems, enabling a more systematic approach to data-derived guarantees.
    Print ISSN: 1532-4435
    Electronic ISSN: 1533-7928
    Topics: Computer Science
  • 54
    Publication Date: 2022
    Description: We show that cascaded diffusion models are capable of generating high fidelity images on the class-conditional ImageNet generation benchmark, without any assistance from auxiliary image classifiers to boost sample quality. A cascaded diffusion model comprises a pipeline of multiple diffusion models that generate images of increasing resolution, beginning with a standard diffusion model at the lowest resolution, followed by one or more super-resolution diffusion models that successively upsample the image and add higher resolution details. We find that the sample quality of a cascading pipeline relies crucially on conditioning augmentation, our proposed method of data augmentation of the lower resolution conditioning inputs to the super-resolution models. Our experiments show that conditioning augmentation prevents compounding error during sampling in a cascaded model, helping us to train cascading pipelines achieving FID scores of 1.48 at 64x64, 3.52 at 128x128 and 4.88 at 256x256 resolutions, outperforming BigGAN-deep, and classification accuracy scores of 63.02% (top-1) and 84.06% (top-5) at 256x256, outperforming VQ-VAE-2.
    Print ISSN: 1532-4435
    Electronic ISSN: 1533-7928
    Topics: Computer Science
  • 55
    Publication Date: 2022
    Description: While the identification of nonlinear dynamical systems is a fundamental building block of model-based reinforcement learning and feedback control, its sample complexity is only understood for systems that either have discrete states and actions or for systems that can be identified from data generated by i.i.d. random inputs. Nonetheless, many interesting dynamical systems have continuous states and actions and can only be identified through a judicious choice of inputs. Motivated by practical settings, we study a class of nonlinear dynamical systems whose state transitions depend linearly on a known feature embedding of state-action pairs. To estimate such systems in finite time, identification methods must explore all directions in feature space. We propose an active learning approach that achieves this by repeating three steps: trajectory planning, trajectory tracking, and re-estimation of the system from all available data. We show that our method estimates nonlinear dynamical systems at a parametric rate, similar to the statistical rate of standard linear regression.
    Print ISSN: 1532-4435
    Electronic ISSN: 1533-7928
    Topics: Computer Science
  • 56
    Publication Date: 2022
    Description: We propose a Bayesian pseudo posterior mechanism to generate record-level synthetic databases equipped with an $(\epsilon,\pi)$-probabilistic differential privacy (pDP) guarantee, where $\pi$ denotes the probability that any observed database exceeds $\epsilon$. The pseudo posterior mechanism employs a data record-indexed, risk-based weight vector with weight values $\in [0, 1]$ that surgically downweight the likelihood contributions for high-risk records for model estimation and the generation of record-level synthetic data for public release. The pseudo posterior synthesizer constructs a weight for each datum record by using the Lipschitz bound for that record under a log-pseudo likelihood utility function that generalizes the exponential mechanism (EM) used to construct a formally private data generating mechanism. By selecting weights to remove likelihood contributions with non-finite log-likelihood values, we guarantee a finite local privacy guarantee for our pseudo posterior mechanism at every sample size. Our results may be applied to any synthesizing model envisioned by the data disseminator in a computationally tractable way that only involves estimation of a pseudo posterior distribution for parameters, $\theta$, unlike recent approaches that use naturally-bounded utility functions implemented through the EM. We specify conditions that guarantee the asymptotic contraction of $\pi$ to $0$ over the space of databases, such that the form of the guarantee provided by our method is asymptotic. We illustrate our pseudo posterior mechanism on the sensitive family income variable from the Consumer Expenditure Surveys database published by the U.S. Bureau of Labor Statistics. We show that utility is better preserved in the synthetic data for our pseudo posterior mechanism as compared to the EM, both estimated using the same non-private synthesizer, due to our use of targeted downweighting.
    Print ISSN: 1532-4435
    Electronic ISSN: 1533-7928
    Topics: Computer Science
  • 57
    Publication Date: 2022
    Description: Decision tree learning is a widely used approach in machine learning, favoured in applications that require concise and interpretable models. Heuristic methods are traditionally used to quickly produce models with reasonably high accuracy. A commonly criticised point, however, is that the resulting trees may not necessarily be the best representation of the data in terms of accuracy and size. In recent years, this motivated the development of optimal classification tree algorithms that globally optimise the decision tree in contrast to heuristic methods that perform a sequence of locally optimal decisions. We follow this line of work and provide a novel algorithm for learning optimal classification trees based on dynamic programming and search. Our algorithm supports constraints on the depth of the tree and number of nodes. The success of our approach is attributed to a series of specialised techniques that exploit properties unique to classification trees. Whereas algorithms for optimal classification trees have traditionally been plagued by high runtimes and limited scalability, we show in a detailed experimental study that our approach uses only a fraction of the time required by the state-of-the-art and can handle datasets with tens of thousands of instances, providing several orders of magnitude improvements and notably contributing towards the practical use of optimal decision trees.
    Print ISSN: 1532-4435
    Electronic ISSN: 1533-7928
    Topics: Computer Science
  • 58
    Publication Date: 2022
    Description: When data is plentiful, the test loss achieved by well-trained neural networks scales as a power-law $L \propto N^{-\alpha}$ in the number of network parameters $N$. This empirical scaling law holds for a wide variety of data modalities, and may persist over many orders of magnitude. The scaling law can be explained if neural models are effectively just performing regression on a data manifold of intrinsic dimension $d$. This simple theory predicts that the scaling exponents $\alpha \approx 4/d$ for cross-entropy and mean-squared error losses. We confirm the theory by independently measuring the intrinsic dimension and the scaling exponents in a teacher/student framework, where we can study a variety of $d$ and $\alpha$ by dialing the properties of random teacher networks. We also test the theory with CNN image classifiers on several datasets and with GPT-type language models.
    Print ISSN: 1532-4435
    Electronic ISSN: 1533-7928
    Topics: Computer Science
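    Illustration (not taken from the article): a small sketch of how the power law $L \propto N^{-\alpha}$ and the predicted exponent $\alpha \approx 4/d$ from the abstract can be compared. It generates noise-free synthetic losses for an assumed intrinsic dimension and recovers the exponent as the negative slope of a least-squares line in log-log coordinates. The prefactor, the parameter counts, and the function names are assumptions. Real measurements would be noisy and would deviate from a pure power law.

```python
import math


def predicted_exponent(d):
    """Scaling exponent predicted by the abstract's rule alpha ~ 4 / d."""
    return 4.0 / d


def fit_exponent(param_counts, losses):
    """Estimate alpha as minus the slope of log(loss) against log(N)."""
    xs = [math.log(n) for n in param_counts]
    ys = [math.log(l) for l in losses]
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return -slope


d = 8                                             # assumed intrinsic dimension
alpha = predicted_exponent(d)
param_counts = [10 ** k for k in range(3, 8)]     # assumed model sizes N
losses = [2.5 * n ** (-alpha) for n in param_counts]  # synthetic, noise-free losses

fitted = fit_exponent(param_counts, losses)
print(f"predicted alpha = {alpha:.3f}, fitted alpha = {fitted:.3f}")
```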
  • 59
    Publication Date: 2022
    Description: We compare the performance of six model average predictors (Mallows' model averaging, stacking, Bayes model averaging, bagging, random forests, and boosting) to the components used to form them. In all six cases we identify conditions under which the model average predictor is consistent for its intended limit and performs as well or better than any of its components asymptotically. This is well known empirically, especially for complex problems, although theoretical results do not seem to have been formally established. We have focused our attention on the regression context since that is where model averaging techniques differ most often from current practice.
    Print ISSN: 1532-4435
    Electronic ISSN: 1533-7928
    Topics: Computer Science
  • 60
    Publication Date: 2022
    Description: Multinomial probit models are routinely-implemented representations for learning how the class probabilities of categorical response data change with $p$ observed predictors. Although several frequentist methods have been developed for estimation, inference and classification within such a class of models, Bayesian inference is still lagging behind. This is due to the apparent absence of a tractable class of conjugate priors, that may facilitate posterior inference on the multinomial probit coefficients. Such an issue has motivated increasing efforts toward the development of effective Markov chain Monte Carlo methods, but state-of-the-art solutions still face severe computational bottlenecks, especially in high dimensions. In this article, we show that the entire class of unified skew-normal (SUN) distributions is conjugate to several multinomial probit models. Leveraging this result and the SUN properties, we improve upon state-of-the-art solutions for posterior inference and classification both in terms of closed-form results for several functionals of interest, and also by developing novel computational methods relying either on independent and identically distributed samples from the exact posterior or on scalable and accurate variational approximations based on blocked partially-factorized representations. As illustrated in simulations and in a gastrointestinal lesions application, the magnitude of the improvements relative to current methods is particularly evident, in practice, when the focus is on high-dimensional studies.
    Print ISSN: 1532-4435
    Electronic ISSN: 1533-7928
    Topics: Computer Science
  • 61
    Publication Date: 2022
    Description: Finding parameters in a deep neural network (NN) that fit training data is a nonconvex optimization problem, but a basic first-order optimization method (gradient descent) finds a global optimizer with perfect fit (zero-loss) in many practical situations. We examine this phenomenon for the case of Residual Neural Networks (ResNet) with smooth activation functions in a limiting regime in which both the number of layers (depth) and the number of weights in each layer (width) go to infinity. First, we use a mean-field-limit argument to prove that the gradient descent for parameter training becomes a gradient flow for a probability distribution that is characterized by a partial differential equation (PDE) in the large-NN limit. Next, we show that under certain assumptions, the solution to the PDE converges in the training time to a zero-loss solution. Together, these results suggest that the training of the ResNet gives a near-zero loss if the ResNet is large enough. We give estimates of the depth and width needed to reduce the loss below a given threshold, with high probability.
    Print ISSN: 1532-4435
    Electronic ISSN: 1533-7928
    Topics: Computer Science
  • 62
    Publication Date: 2022
    Description: We propose algorithms for approximate filtering and smoothing in high-dimensional Factorial hidden Markov models. The approximation involves discarding, in a principled way, likelihood factors according to a notion of locality in a factor graph associated with the emission distribution. This allows the exponential-in-dimension cost of exact filtering and smoothing to be avoided. We prove that the approximation accuracy, measured in a local total variation norm, is "dimension-free" in the sense that as the overall dimension of the model increases the error bounds we derive do not necessarily degrade. A key step in the analysis is to quantify the error introduced by localizing the likelihood function in a Bayes' rule update. The factorial structure of the likelihood function which we exploit arises naturally when data have known spatial or network structure. We demonstrate the new algorithms on synthetic examples and a London Underground passenger flow problem, where the factor graph is effectively given by the train network.
    Print ISSN: 1532-4435
    Electronic ISSN: 1533-7928
    Topics: Computer Science
  • 63
    Publication Date: 2022
    Description: Bayesian multinomial logistic-normal (MLN) models are popular for the analysis of sequence count data (e.g., microbiome or gene expression data) due to their ability to model multivariate count data with complex covariance structure. However, existing implementations of MLN models are limited to small datasets due to the non-conjugacy of the multinomial and logistic-normal distributions. Motivated by the need to develop efficient inference for Bayesian MLN models, we develop two key ideas. First, we develop the class of Marginally Latent Matrix-T Process (Marginally LTP) models. We demonstrate that many popular MLN models, including those with latent linear, non-linear, and dynamic linear structure are special cases of this class. Second, we develop an efficient inference scheme for Marginally LTP models with specific accelerations for the MLN subclass. Through application to MLN models, we demonstrate that our inference scheme is both highly accurate and often 4-5 orders of magnitude faster than MCMC.
    Print ISSN: 1532-4435
    Electronic ISSN: 1533-7928
    Topics: Computer Science
  • 64
    Publication Date: 2022
    Description: We introduce a novel approach to estimation problems in settings with missing data. Our proposal -- the Correlation-Assisted Missing data (CAM) estimator -- works by exploiting the relationship between the observations with missing features and those without missing features in order to obtain improved prediction accuracy. In particular, our theoretical results elucidate general conditions under which the proposed CAM estimator has lower mean squared error than the widely used complete-case approach in a range of estimation problems. We showcase in detail how the CAM estimator can be applied to $U$-Statistics to obtain an unbiased, asymptotically Gaussian estimator that has lower variance than the complete-case $U$-Statistic. Further, in nonparametric density estimation and regression problems, we construct our CAM estimator using kernel functions, and show it has lower asymptotic mean squared error than the corresponding complete-case kernel estimator. We also include practical demonstrations throughout the paper using simulated data and the Terneuzen birth cohort and Brandsma datasets available from CRAN.
    Print ISSN: 1532-4435
    Electronic ISSN: 1533-7928
    Topics: Computer Science
  • 65
    Publication Date: 2022
    Description: Plug-and-Play (PnP) is a non-convex optimization framework that combines proximal algorithms, for example, the alternating direction method of multipliers (ADMM), with advanced denoising priors. Over the past few years, great empirical success has been obtained by PnP algorithms, especially for the ones that integrate deep learning-based denoisers. However, a key problem of PnP approaches is the need for manual parameter tweaking which is essential to obtain high-quality results across the high discrepancy in imaging conditions and varying scene content. In this work, we present a class of tuning-free PnP proximal algorithms that can determine parameters such as denoising strength, termination time, and other optimization-specific parameters automatically. A core part of our approach is a policy network for automated parameter search which can be effectively learned via a mixture of model-free and model-based deep reinforcement learning strategies. We demonstrate, through rigorous numerical and visual experiments, that the learned policy can customize parameters to different settings, and is often more efficient and effective than existing handcrafted criteria. Moreover, we discuss several practical considerations of PnP denoisers, which together with our learned policy yield state-of-the-art results. This advanced performance is prevalent on both linear and nonlinear exemplar inverse imaging problems, and in particular shows promising results on compressed sensing MRI, sparse-view CT, single-photon imaging, and phase retrieval.
    Print ISSN: 1532-4435
    Electronic ISSN: 1533-7928
    Topics: Computer Science
  • 66
    Publication Date: 2022
    Description: We combine two popular optimization approaches to derive learning algorithms for generative models: variational optimization and evolutionary algorithms. The combination is realized for generative models with discrete latents by using truncated posteriors as the family of variational distributions. The variational parameters of truncated posteriors are sets of latent states. By interpreting these states as genomes of individuals and by using the variational lower bound to define a fitness, we can apply evolutionary algorithms to realize the variational loop. The variational distributions used are very flexible and we show that evolutionary algorithms can effectively and efficiently optimize the variational bound. Furthermore, the variational loop is generally applicable (“black box”) with no analytical derivations required. To show general applicability, we apply the approach to three generative models (we use Noisy-OR Bayes Nets, Binary Sparse Coding, and Spike-and-Slab Sparse Coding). To demonstrate the effectiveness and efficiency of the novel variational approach, we use the standard competitive benchmarks of image denoising and inpainting. The benchmarks allow quantitative comparisons to a wide range of methods including probabilistic approaches, deep deterministic and generative networks, and non-local image processing methods. In the category of “zero-shot” learning (when only the corrupted image is used for training), we observed the evolutionary variational algorithm to significantly improve the state-of-the-art in many benchmark settings. For one well-known inpainting benchmark, we also observed state-of-the-art performance across all categories of algorithms although we only train on the corrupted image. In general, our investigations highlight the importance of research on optimization methods for generative models to achieve performance improvements.
    Print ISSN: 1532-4435
    Electronic ISSN: 1533-7928
    Topics: Computer Science
  • 67
    Publication Date: 2022
    Description: We propose a novel method for training deep neural networks that are capable of interpolation, that is, driving the empirical loss to zero. At each iteration, our method constructs a stochastic approximation of the learning objective. The approximation, known as a bundle, is a pointwise maximum of linear functions. Our bundle contains a constant function that lower bounds the empirical loss. This enables us to compute an automatic adaptive learning rate, thereby providing an accurate solution. In addition, our bundle includes linear approximations computed at the current iterate and other linear estimates of the DNN parameters. The use of these additional approximations makes our method significantly more robust to its hyperparameters. Based on its desirable empirical properties, we term our method Bundle Optimisation for Robust and Accurate Training (BORAT). In order to operationalise BORAT, we design a novel algorithm for optimising the bundle approximation efficiently at each iteration. We establish the theoretical convergence of BORAT in both convex and non-convex settings. Using standard publicly available data sets, we provide a thorough comparison of BORAT to other single hyperparameter optimisation algorithms. Our experiments demonstrate BORAT matches the state-of-the-art generalisation performance for these methods and is the most robust.
    Print ISSN: 1532-4435
    Electronic ISSN: 1533-7928
    Topics: Computer Science
  • 68
    Publication Date: 2022
    Description: Sparse principal component analysis (PCA) is a popular dimensionality reduction technique for obtaining principal components which are linear combinations of a small subset of the original features. Existing approaches cannot supply certifiably optimal principal components with more than $p=100$s of variables. By reformulating sparse PCA as a convex mixed-integer semidefinite optimization problem, we design a cutting-plane method which solves the problem to certifiable optimality at the scale of selecting $k=5$ covariates from $p=300$ variables, and provides small bound gaps at a larger scale. We also propose a convex relaxation and greedy rounding scheme that provides bound gaps of $1-2\%$ in practice within minutes for $p=100$s or hours for $p=1,000$s and is therefore a viable alternative to the exact method at scale. Using real-world financial and medical data sets, we illustrate our approach's ability to derive interpretable principal components tractably at scale.
    Print ISSN: 1532-4435
    Electronic ISSN: 1533-7928
    Topics: Computer Science
  • 69
    Publication Date: 2022
    Description: In the theory of Partially Observed Markov Decision Processes (POMDPs), the existence of optimal policies has in general been established via converting the original partially observed stochastic control problem to a fully observed one on the belief space, leading to a belief-MDP. However, computing an optimal policy for this fully observed model, and so for the original POMDP, using classical dynamic or linear programming methods is challenging even if the original system has finite state and action spaces, since the state space of the fully observed belief-MDP model is always uncountable. Furthermore, there exist very few rigorous value function approximation and optimal policy approximation results, as regularity conditions needed often require a tedious study involving the spaces of probability measures leading to properties such as Feller continuity. In this paper, we study a planning problem for POMDPs where the system dynamics and measurement channel model are assumed to be known. We construct an approximate belief model by discretizing the belief space using only finite window information variables. We then find optimal policies for the approximate model and we rigorously establish near optimality of the constructed finite window control policies in POMDPs under mild non-linear filter stability conditions and the assumption that the measurement and action sets are finite (and the state space is real vector valued). We also establish a rate of convergence result which relates the finite window memory size and the approximation error bound, where the rate of convergence is exponential under explicit and testable exponential filter stability conditions. While there exist many experimental results and few rigorous asymptotic convergence results, an explicit rate of convergence result is new in the literature, to our knowledge.
    Print ISSN: 1532-4435
    Electronic ISSN: 1533-7928
    Topics: Computer Science
  • 70
    Publication Date: 2022
    Description: Convergence to a saddle point for convex-concave functions has been studied for decades, while recent years have seen a surge of interest in non-convex (zero-sum) smooth games, motivated by their recent wide applications. It remains an intriguing research challenge how local optimal points are defined and which algorithm can converge to such points. An interesting concept is known as the local minimax point, which strongly correlates with the widely-known gradient descent ascent algorithm. This paper aims to provide a comprehensive analysis of local minimax points, such as their relation with other solution concepts and their optimality conditions. We find that local saddle points can be regarded as a special type of local minimax points, called uniformly local minimax points, under mild continuity assumptions. In (non-convex) quadratic games, we show that local minimax points are (in some sense) equivalent to global minimax points. Finally, we study the stability of gradient algorithms near local minimax points. Although gradient algorithms can converge to local/global minimax points in the non-degenerate case, they would often fail in general cases. This implies the necessity of either novel algorithms or concepts beyond saddle points and minimax points in non-convex smooth games.
    Print ISSN: 1532-4435
    Electronic ISSN: 1533-7928
    Topics: Computer Science
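The gradient descent ascent algorithm mentioned in the abstract above can be sketched in a few lines. This is a minimal textbook illustration, not code or an example from the paper: the two toy objectives, the step size, and the function name `gda` are all assumptions chosen for demonstration. It contrasts a degenerate bilinear game, where simultaneous updates drift away from the saddle point at the origin, with a game that has curvature in both variables, where the same updates contract toward it.

```python
import math


def gda(grad_x, grad_y, x, y, step, iters):
    """Simultaneous gradient descent (in x) and ascent (in y)."""
    for _ in range(iters):
        gx, gy = grad_x(x, y), grad_y(x, y)
        # Both players move at once, using gradients at the old point.
        x, y = x - step * gx, y + step * gy
    return x, y


# Hypothetical toy game 1: f(x, y) = x * y (no curvature in x or y alone).
bilinear = gda(lambda x, y: y, lambda x, y: x, 1.0, 1.0, 0.1, 200)

# Hypothetical toy game 2: f(x, y) = 0.5*x**2 + x*y - 0.5*y**2.
curved = gda(lambda x, y: x + y, lambda x, y: x - y, 1.0, 1.0, 0.1, 200)

# Distance of the final iterate from the saddle point (0, 0).
bilinear_dist = math.hypot(*bilinear)
curved_dist = math.hypot(*curved)
print(bilinear_dist, curved_dist)
```

In the bilinear game each step multiplies the distance to the origin by sqrt(1 + step**2), so the iterates spiral outward for any positive step size; in the second game the factor is sqrt((1 - step)**2 + step**2), which is below one for step sizes between 0 and 1.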
  • 71
    Publication Date: 2022
    Description: As a popular meta-learning approach, the model-agnostic meta-learning (MAML) algorithm has been widely used due to its simplicity and effectiveness. However, the convergence of the general multi-step MAML still remains unexplored. In this paper, we develop a new theoretical framework to provide such a convergence guarantee for two types of objective functions that are of interest in practice: (a) resampling case (e.g., reinforcement learning), where loss functions take the form of an expectation and new data are sampled as the algorithm runs; and (b) finite-sum case (e.g., supervised learning), where loss functions take the finite-sum form with given samples. For both cases, we characterize the convergence rate and the computational complexity to attain an $\epsilon$-accurate solution for multi-step MAML in the general nonconvex setting. In particular, our results suggest that an inner-stage stepsize needs to be chosen inversely proportional to the number $N$ of inner-stage steps in order for $N$-step MAML to have guaranteed convergence. From the technical perspective, we develop novel techniques to deal with the nested structure of the meta gradient for multi-step MAML, which can be of independent interest.
    Print ISSN: 1532-4435
    Electronic ISSN: 1533-7928
    Topics: Computer Science
  • 72
    Publication Date: 2022
    Description: We propose a new tool for visualizing complex, and potentially large and high-dimensional, data sets called Centroid-Encoder (CE). The architecture of the Centroid-Encoder is similar to the autoencoder neural network but it has a modified target, i.e., the class centroid in the ambient space. As such, CE incorporates label information and performs a supervised data visualization. The training of CE is done in the usual way with a training set whose parameters are tuned using a validation set. The evaluation of the resulting CE visualization is performed on a sequestered test set where the generalization of the model is assessed both visually and quantitatively. We present a detailed comparative analysis of the method using a wide variety of data sets and techniques, both supervised and unsupervised, including NCA, non-linear NCA, t-distributed NCA, t-distributed MCML, supervised UMAP, supervised PCA, Colored Maximum Variance Unfolding, supervised Isomap, Parametric Embedding, supervised Neighbor Retrieval Visualizer, and Multiple Relational Embedding. An analysis of variance using PCA demonstrates that a non-linear preprocessing by the CE transformation of the data captures more variance than PCA by dimension.
    Print ISSN: 1532-4435
    Electronic ISSN: 1533-7928
    Topics: Computer Science
  • 73
    Publication Date: 2022
    Description: We develop a rigorous and general framework for constructing information-theoretic divergences that subsume both $f$-divergences and integral probability metrics (IPMs), such as the $1$-Wasserstein distance. We prove under which assumptions these divergences, hereafter referred to as $(f,\Gamma)$-divergences, provide a notion of `distance' between probability measures and show that they can be expressed as a two-stage mass-redistribution/mass-transport process. The $(f,\Gamma)$-divergences inherit features from IPMs, such as the ability to compare distributions which are not absolutely continuous, as well as from $f$-divergences, namely the strict concavity of their variational representations and the ability to control heavy-tailed distributions for particular choices of $f$. When combined, these features establish a divergence with improved properties for estimation, statistical learning, and uncertainty quantification applications. Using statistical learning as an example, we demonstrate their advantage in training generative adversarial networks (GANs) for heavy-tailed, not-absolutely continuous sample distributions. We also show improved performance and stability over gradient-penalized Wasserstein GAN in image generation.
    Print ISSN: 1532-4435
    Electronic ISSN: 1533-7928
    Topics: Computer Science
  • 74
    Publication Date: 2022
    Description: In this paper, we propose a flexible model for survival analysis using neural networks along with scalable optimization algorithms. One key technical challenge for directly applying maximum likelihood estimation (MLE) to censored data is that evaluating the objective function and its gradients with respect to model parameters requires the calculation of integrals. To address this challenge, we recognize from a novel perspective that the MLE for censored data can be viewed as a differential-equation constrained optimization problem. Following this connection, we model the distribution of event time through an ordinary differential equation and utilize efficient ODE solvers and adjoint sensitivity analysis to numerically evaluate the likelihood and the gradients. Using this approach, we are able to 1) provide a broad family of continuous-time survival distributions without strong structural assumptions, 2) obtain powerful feature representations using neural networks, and 3) allow efficient estimation of the model in large-scale applications using stochastic gradient descent. Through both simulation studies and real-world data examples, we demonstrate the effectiveness of the proposed method in comparison to existing state-of-the-art deep learning survival analysis models. The implementation of the proposed SODEN approach has been made publicly available at https://github.com/jiaqima/SODEN.
    Print ISSN: 1532-4435
    Electronic ISSN: 1533-7928
    Topics: Computer Science
  • 75
    Publication Date: 2022
    Description: High resolution geospatial data are challenging because standard geostatistical models based on Gaussian processes are known to not scale to large data sizes. While progress has been made towards methods that can be computed more efficiently, considerably less attention has been devoted to methods for large scale data that allow the description of complex relationships between several outcomes recorded at high resolutions by different sensors. Our Bayesian multivariate regression models based on spatial multivariate trees (SpamTrees) achieve scalability via conditional independence assumptions on latent random effects following a treed directed acyclic graph. Information-theoretic arguments and considerations on computational efficiency guide the construction of the tree and the related efficient sampling algorithms in imbalanced multivariate settings. In addition to simulated data examples, we illustrate SpamTrees using a large climate data set which combines satellite data with land-based station data. Software and source code are available on CRAN at https://CRAN.R-project.org/package=spamtree.
    Print ISSN: 1532-4435
    Electronic ISSN: 1533-7928
    Topics: Computer Science
  • 76
    Publication Date: 2022
    Description: In rank aggregation (RA), a collection of preferences from different users is summarized into a total order under the assumption of homogeneity of users. Model misspecification in RA arises since the homogeneity assumption fails to be satisfied in complex real-world situations. Existing robust RAs usually resort to an augmentation of the ranking model to account for additional noise, where the collected preferences can be treated as a noisy perturbation of idealized preferences. Since the majority of robust RAs rely on certain perturbation assumptions, they cannot generalize well to agnostic noise-corrupted preferences in the real world. In this paper, we propose CoarsenRank, which possesses robustness against model misspecification. Specifically, the properties of our CoarsenRank are summarized as follows: (1) CoarsenRank is designed for mild model misspecification, which assumes there exist ideal preferences (consistent with the model assumptions) located in a neighborhood of the actual preferences. (2) CoarsenRank then performs regular RAs over a neighborhood of the preferences instead of the original data set directly. Therefore, CoarsenRank enjoys robustness against model misspecification within a neighborhood. (3) The neighborhood of the data set is defined via their empirical data distributions. Further, we put an exponential prior on the unknown size of the neighborhood and derive a much-simplified posterior formula for CoarsenRank under particular divergence measures. (4) CoarsenRank is further instantiated to Coarsened Thurstone, Coarsened Bradley-Terry, and Coarsened Plackett-Luce with three popular probability ranking models. Meanwhile, tractable optimization strategies are introduced for each instantiation. In the end, we apply CoarsenRank to four real-world data sets. Experiments show that CoarsenRank is fast and robust, achieving consistent improvements over baseline methods.
    Print ISSN: 1532-4435
    Electronic ISSN: 1533-7928
    Topics: Computer Science
  • 77
    Publication Date: 2022
    Description: The rapid development of high-throughput technologies has enabled the generation of data from biological or disease processes that span multiple layers, like genomic, proteomic or metabolomic data, and further pertain to multiple sources, like disease subtypes or experimental conditions. In this work, we propose a general statistical framework based on Gaussian graphical models for horizontal (i.e. across conditions or subtypes) and vertical (i.e. across different layers containing data on molecular compartments) integration of information in such datasets. We start with decomposing the multi-layer problem into a series of two-layer problems. For each two-layer problem, we model the outcomes at a node in the lower layer as dependent on those of other nodes in that layer, as well as all nodes in the upper layer. We use a combination of neighborhood selection and group-penalized regression to obtain sparse estimates of all model parameters. Following this, we develop a debiasing technique and asymptotic distributions of inter-layer directed edge weights that utilize already computed neighborhood selection coefficients for nodes in the upper layer. Subsequently, we establish global and simultaneous testing procedures for these edge weights. Performance of the proposed methodology is evaluated on synthetic and real data.
    Print ISSN: 1532-4435
    Electronic ISSN: 1533-7928
    Topics: Computer Science
  • 78
    Publication Date: 2022
    Description: This work studies finite-sample properties of the risk of the minimum-norm interpolating predictor in high-dimensional regression models. If the effective rank of the covariance matrix $\Sigma$ of the $p$ regression features is much larger than the sample size $n$, we show that the min-norm interpolating predictor is not desirable, as its risk approaches the risk of trivially predicting the response by 0. However, our detailed finite-sample analysis reveals, surprisingly, that this behavior is not present when the regression response and the features are jointly low-dimensional, following a widely used factor regression model. Within this popular model class, and when the effective rank of $\Sigma$ is smaller than $n$, while still allowing for $p \gg n$, both the bias and the variance terms of the excess risk can be controlled, and the risk of the minimum-norm interpolating predictor approaches optimal benchmarks. Moreover, through a detailed analysis of the bias term, we exhibit model classes under which our upper bound on the excess risk approaches zero, while the corresponding upper bound in the recent work arXiv:1906.11300 diverges. Furthermore, we show that the minimum-norm interpolating predictor analyzed under the factor regression model, despite being model-agnostic and devoid of tuning parameters, can have similar risk to predictors based on principal components regression and ridge regression, and can improve over LASSO based predictors, in the high-dimensional regime.
    Print ISSN: 1532-4435
    Electronic ISSN: 1533-7928
    Topics: Computer Science
  • 79
    Publication Date: 2022
    Description: Game-theoretic attribution techniques based on Shapley values are used to interpret black-box machine learning models, but their exact calculation is generally NP-hard, requiring approximation methods for non-trivial models. As the computation of Shapley values can be expressed as a summation over a set of permutations, a common approach is to sample a subset of these permutations for approximation. Unfortunately, standard Monte Carlo sampling methods can exhibit slow convergence, and more sophisticated quasi-Monte Carlo methods have not yet been applied to the space of permutations. To address this, we investigate new approaches based on two classes of approximation methods and compare them empirically. First, we demonstrate quadrature techniques in an RKHS containing functions of permutations, using the Mallows kernel in combination with kernel herding and sequential Bayesian quadrature. The RKHS perspective also leads to quasi-Monte Carlo type error bounds, with a tractable discrepancy measure defined on permutations. Second, we exploit connections between the hypersphere $\mathbb{S}^{d-2}$ and permutations to create practical algorithms for generating permutation samples with good properties. Experiments show the above techniques provide significant improvements for Shapley value estimates over existing methods, converging to a smaller RMSE in the same number of model evaluations.
    Print ISSN: 1532-4435
    Electronic ISSN: 1533-7928
    Topics: Computer Science
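The permutation view of Shapley values described in the abstract above can be made concrete with a small sketch. It shows only the plain Monte Carlo baseline (uniformly sampled permutations) next to exact enumeration; it does not implement the kernel herding, Bayesian quadrature, or hypersphere sampling methods the paper proposes. The three-player game `toy_value` and all function names are hypothetical and exist purely for illustration.

```python
import itertools
import random


def marginal_contributions(value, order, totals):
    """Add each player's marginal contribution along one ordering."""
    coalition = set()
    prev = value(frozenset(coalition))
    for player in order:
        coalition.add(player)
        cur = value(frozenset(coalition))
        totals[player] += cur - prev
        prev = cur


def shapley_exact(value, n):
    """Average marginal contributions over all n! permutations."""
    totals = [0.0] * n
    count = 0
    for perm in itertools.permutations(range(n)):
        marginal_contributions(value, perm, totals)
        count += 1
    return [t / count for t in totals]


def shapley_sampled(value, n, num_samples, seed=0):
    """Plain Monte Carlo estimate from uniformly random permutations."""
    rng = random.Random(seed)
    totals = [0.0] * n
    players = list(range(n))
    for _ in range(num_samples):
        rng.shuffle(players)
        marginal_contributions(value, players, totals)
    return [t / num_samples for t in totals]


def toy_value(coalition):
    # Hypothetical game: additive weights plus a bonus of 2 that is
    # earned only when players 0 and 1 are both in the coalition.
    weights = [1.0, 2.0, 3.0]
    bonus = 2.0 if {0, 1} <= coalition else 0.0
    return sum(weights[i] for i in coalition) + bonus


exact = shapley_exact(toy_value, 3)
estimate = shapley_sampled(toy_value, 3, num_samples=500)
print(exact, estimate)
```

Exact enumeration costs n! orderings, which is why sampling is used for non-trivial models. Every ordering's marginal contributions telescope to the value of the full coalition, so the sampled estimates always sum to that value even when the individual entries are noisy.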
  • 80
    Publication Date: 2022
    Description: Algorithm parameters, in particular hyperparameters of machine learning algorithms, can substantially impact their performance. To support users in determining well-performing hyperparameter configurations for their algorithms, datasets and applications at hand, SMAC3 offers a robust and flexible framework for Bayesian Optimization, which can improve performance within a few evaluations. It offers several facades and pre-sets for typical use cases, such as optimizing hyperparameters, solving low dimensional continuous (artificial) global optimization problems and configuring algorithms to perform well across multiple problem instances. The SMAC3 package is available under a permissive BSD-license at https://github.com/automl/SMAC3.
    Print ISSN: 1532-4435
    Electronic ISSN: 1533-7928
    Topics: Computer Science
  • 81
  • 82
    Publication Date: 2022-02-01
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 83
    Publication Date: 2022-02-01
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 84
  • 85
    Publication Date: 2022-02-01
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 86
    Publication Date: 2022-02-01
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 87
    Publication Date: 2022-02-01
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 88
    Publication Date: 2022-02-01
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 89
    Publication Date: 2022-02-01
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 90
    Publication Date: 2022-02-01
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 91
    Publication Date: 2022-02-01
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 92
    Publication Date: 2022-02-01
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 93
    Publication Date: 2022-02-01
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 94
    Publication Date: 2022-02-01
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 95
    Publication Date: 2022-02-01
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 96
    Publication Date: 2022-02-01
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 97
    Publication Date: 2022-02-01
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 98
    Publication Date: 2022-02-01
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 99
  • 100
    Publication Date: 2022-01-01
    Print ISSN: 0040-1625
    Electronic ISSN: 1873-5509
    Topics: Geography , Sociology , Technology
    Published by Elsevier