ALBERT

All Library Books, journals and Electronic Records Telegrafenberg

Filter
Collection
  • Articles  (214)
Publisher
  • Institute of Electrical and Electronics Engineers (IEEE)  (214)
  • Molecular Diversity Preservation International
Publication period
  • 2020-2022
  • 2015-2019
  • 2010-2014  (214)
  • 1990-1994
  • 1945-1949
Year
  • 2013  (214)
Journal
  • IEEE Transactions on Computers (T-C)  (214)
Subject
  • Computer Science  (214)
  • Economics
  • 1
    Material type: Unknown
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication date: 2013-09-28
    Beschreibung: File replication is widely used in structured P2P systems to avoid hot spots in servers and enhance file availability. The number of replicas and replication distance affect the file replication cost. These two elements and the replica update frequency determined in the file replication stage also affect the cost of subsequent consistency maintenance. However, most existing file replication protocols focus on improving file lookup efficiency without considering its cost and its subsequent influence on consistency maintenance. This paper studies the problem about how a server chooses files to replicate and where to replicate files to achieve low cost in both file replication and consistency maintenance stages without compromising the effectiveness of file replication. This paper presents a lightweight and Cooperative multifactOr considered file Replication Protocol (CORP) to achieve this goal. CORP simultaneously takes into account multiple factors including file popularity, update rate, node available capacity, file load, and node locality, aiming to minimize the number of replicas, update frequency, and replication distance. CORP also dynamically adjusts the number of replicas based on ever-changing file popularity and visit pattern. Extensive experimental results from simulation and PlanetLab real-world testbed demonstrate the efficiency and effectiveness of CORP in comparison with other file replication protocols. It dramatically reduces the overhead of both file replication and consistency maintenance. In addition, it exhibits high adaptiveness to skewed lookups and yields significant improvement in reducing overloaded nodes. Specifically, compared to the other replication protocols, CORP can reduce more than 71 percent of file replicas, 84 percent of overloaded nodes, 94 percent of consistency maintenance cost, and 72 percent of file replication and consistency maintenance latency.
    Print ISSN: 0018-9340
    Digital ISSN: 1557-9956
    Subject: Computer Science
  • 2
    Material type: Unknown
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication date: 2013-09-28
    Beschreibung: Power consumption has become a limiting factor in designing next generation network routers. Recent observation shows that IP lookup engines dominate the power consumption of core routers. Previous work on reducing power consumption of routers mainly focused on network- and system-level optimizations. This paper represents the first thorough study on the data structure optimization for lowering the power consumption in static random access memory (SRAM)-based IP lookup engines. Three different SRAM-based IP lookup architectures are discussed: nonpipelined, simple pipelined, and memory-balanced pipelined architectures. For each architecture, we formulate the problem of power minimization by revisiting the time-space tradeoff in multibit tries. Two distinct multibit trie algorithms are investigated: the expanded trie and the tree bitmap trie, which are widely used in SRAM-based IP lookup solutions. A theoretical framework is proposed to determine the optimal strides for building a multibit trie so that the worst-case power consumption of the IP lookup architecture is minimized. Experiments using real-life routing tables including both IPv4 and IPv6 data sets demonstrate that careful selection of strides in building the multibit tries can reduce the power consumption dramatically. We believe our methodology can be applied to other variants of multibit tries and can help in designing more power-efficient SRAM-based IP lookup architectures.
    Print ISSN: 0018-9340
    Digital ISSN: 1557-9956
    Subject: Computer Science
  • 3
    Material type: Unknown
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication date: 2013-09-28
    Description: As new transport protocols are being proposed and standardized, the choice of the best communication service to be used by applications for delivering their data when distributed is becoming too complex. Application developers need much knowledge on "how the protocol works" to decide whether or not it can be used to fulfill their requirements. Moreover, the performance of the service provided by a given communication protocol is highly dependent on the network context. The Autonomic Transport Protocol presented in this paper is aware of the application requirements and uses learning techniques to adapt the service it provides to best satisfy these requirements as the network conditions vary.
    Print ISSN: 0018-9340
    Digital ISSN: 1557-9956
    Subject: Computer Science
  • 4
    Material type: Unknown
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication date: 2013-09-28
    Description: Hardware Trojan attack in the form of malicious modification of a design has emerged as a major security threat. Side-channel analysis has been investigated as an alternative to conventional logic testing to detect the presence of hardware Trojans. However, these techniques suffer from decreased sensitivity toward small Trojans, especially because of the large process variations present in modern nanometer technologies. In this paper, we propose a novel noninvasive, multiple-parameter side-channel analysis-based Trojan detection approach. We use the intrinsic relationship between dynamic current and maximum operating frequency of a circuit to isolate the effect of a Trojan circuit from process noise. We propose a vector generation approach and several design/test techniques to improve the detection sensitivity. Simulation results with two large circuits, a 32-bit integer execution unit (IEU) and a 128-bit advanced encryption standard (AES) cipher, show a detection resolution of 1.12 percent amidst ±20 percent parameter variations. The approach is also validated with experimental results. Finally, the use of a combined side-channel analysis and logic testing approach is shown to provide high overall detection coverage for hardware Trojan circuits of varying types and sizes.
    Print ISSN: 0018-9340
    Digital ISSN: 1557-9956
    Subject: Computer Science
  • 5
    Material type: Unknown
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication date: 2013-09-28
    Beschreibung: Public-key encryption with keyword search (PEKS) is a versatile tool. It allows a third party knowing the search trapdoor of a keyword to search encrypted documents containing that keyword without decrypting the documents or knowing the keyword. However, it is shown that the keyword will be compromised by a malicious third party under a keyword guess attack (KGA) if the keyword space is in a polynomial size. We address this problem with a keyword privacy enhanced variant of PEKS referred to as public-key encryption with fuzzy keyword search (PEFKS). In PEFKS, each keyword corresponds to an exact keyword search trapdoor and a fuzzy keyword search trapdoor. Two or more keywords share the same fuzzy keyword trapdoor. To search encrypted documents containing a specific keyword, only the fuzzy keyword search trapdoor is provided to the third party, i.e., the searcher. Thus, in PEFKS, a malicious searcher can no longer learn the exact keyword to be searched even if the keyword space is small. We propose a universal transformation which converts any anonymous identity-based encryption (IBE) scheme into a secure PEFKS scheme. Following the generic construction, we instantiate the first PEFKS scheme proven to be secure under KGA in the case that the keyword space is in a polynomial size.
    Print ISSN: 0018-9340
    Digital ISSN: 1557-9956
    Subject: Computer Science
  • 6
    Material type: Unknown
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication date: 2013-09-28
    Beschreibung: This paper presents techniques for low-power addition/subtraction in the logarithmic number system (LNS) and quantifies their impact on digital filter VLSI implementation. The impact of partitioning the look-up tables required for LNS addition/subtraction on complexity, performance, and power dissipation of the corresponding circuits is quantified. Two design parameters are exploited to minimize complexity, namely the LNS base and the organization of the LNS word. A roundoff noise model is used to demonstrate the impact of base and word length on the signal-to-noise ratio of the output of finite impulse response (FIR) filters. In addition, techniques for the low-power implementation of an LNS multiply accumulate (MAC) units are investigated. Furthermore, it is shown that the proposed techniques can be extended to cotransformation-based circuits that employ interpolators. The results are demonstrated by evaluating the power dissipation, complexity and performance of several FIR filter configurations comprising one, two or four MAC units. Simulations of placed and routed VLSI LNS-based digital filters using a 90-nm 1.0 V CMOS standard-cell library reveal that significant power dissipation savings are possible by using optimized LNS circuits at no performance penalty, when compared to linear fixed-point two's-complement equivalents.
    Print ISSN: 0018-9340
    Digital ISSN: 1557-9956
    Subject: Computer Science
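    A note on the arithmetic behind this entry: LNS addition is usually reduced to a table lookup of log2(1 + 2^-d), where d is the difference of the two log-domain operands. The Python sketch below (not taken from the paper; the base, table resolution, and table range are assumed for illustration) shows the identity that such look-up tables approximate.

        import math

        BASE = 2.0
        STEP = 1.0 / 256       # assumed table resolution
        TABLE_RANGE = 32.0     # beyond this difference, log2(1 + 2^-d) is ~0

        # Precomputed table of phi_plus(d) = log2(1 + 2^-d) for d >= 0
        PHI_PLUS = [math.log2(1.0 + BASE ** (-i * STEP))
                    for i in range(int(TABLE_RANGE / STEP) + 1)]

        def lns_add(x, y):
            """Return log2(X + Y) given x = log2(X) and y = log2(Y)."""
            hi, lo = (x, y) if x >= y else (y, x)
            d = hi - lo
            if d >= TABLE_RANGE:
                return hi                      # smaller operand is negligible
            return hi + PHI_PLUS[round(d / STEP)]

        # Example: X = 12, Y = 5, so the exact result is log2(17)
        print(lns_add(math.log2(12), math.log2(5)), math.log2(17))

    The table size and accuracy trade-off sketched here is exactly what the partitioning and base-selection techniques in the abstract optimize.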
  • 7
    Material type: Unknown
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication date: 2013-09-28
    Beschreibung: NAND flash-based storage device is becoming a viable storage solution for mobile and desktop systems. Because of the erase-before-write nature, flash-based storage devices require garbage collection that causes significant performance degradation, incurring a large number of page migrations and block erasures. To improve I/O performance, therefore, it is important to develop an efficient garbage collection algorithm. In this paper, we propose a novel garbage collection technique, called buffer-aware garbage collection (BAGC), for flash-based storage devices. The BAGC improves the efficiency of two main steps of garbage collection, a block merge step and a victim block selection step, by taking account of the contents of a buffer cache, which is typically used to enhance I/O performance. The buffer-aware block merge (BABM) scheme eliminates unnecessary page migrations by evicting dirty data from a buffer cache during a block merge step. The buffer-aware victim block selection (BAVBS) scheme, on the other hand, selects a victim block so that the benefit of the buffer-aware block merge is maximized. Our experimental results show that BAGC improves I/O performance by up to 43 percent over existing buffer-unaware schemes for various benchmarks.
    Print ISSN: 0018-9340
    Digital ISSN: 1557-9956
    Subject: Computer Science
  • 8
    Material type: Unknown
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication date: 2013-09-28
    Beschreibung: Workflow-based workloads usually consist of multiple instances of the same workflow, which are jobs with control or data dependencies to carry out a well-defined scientific computation task, with each instance acting on its own input data. To maximize the performance, a high degree of concurrency is always achieved by running multiple instances simultaneously. However, since the amount of storage is limited on most systems, deadlock due to oversubscribed storage requests is a potential problem. To address this problem, we integrate two novel concepts with the traditional problem of deadlock avoidance by proposing two algorithms that can maximize active (not just allocated) resource utilization and minimize makespan. Our approach is based on the well-known banker's algorithm, but our algorithms make the important distinction between active and inactive resources, which is not a part of previous approaches. The central idea is to leverage the data-flow information to dynamically approximate localized maximum claim (i.e., the resource requirements of the remaining jobs of the instance) to improve either interinstance or intrainstance concurrency and still avoid deadlock. Through simulation-based studies, we show how our proposed algorithms are better than the classic banker's algorithm and the more recent Lang's algorithm in terms of makespan and active storage resource utilization.
    Print ISSN: 0018-9340
    Digital ISSN: 1557-9956
    Subject: Computer Science
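    For context, the classic banker's-style safety check that this work builds on can be sketched in a few lines of Python. This is a generic illustration only; the paper's distinction between active and inactive resources and its data-flow-derived localized maximum claims are not modeled here.

        def is_safe(available, allocation, max_claim):
            """Classic banker's safety test.
            available: free units per resource type;
            allocation / max_claim: one row per job (workflow instance)."""
            work = list(available)
            need = [[m - a for m, a in zip(mc, al)]
                    for mc, al in zip(max_claim, allocation)]
            finished = [False] * len(allocation)
            progress = True
            while progress:
                progress = False
                for j, done in enumerate(finished):
                    if not done and all(n <= w for n, w in zip(need[j], work)):
                        # job j can run to completion and release its allocation
                        work = [w + a for w, a in zip(work, allocation[j])]
                        finished[j] = True
                        progress = True
            return all(finished)

        # One storage pool with 3 free units; three instances currently hold
        # 1, 4, and 2 units and may claim at most 4, 6, and 7 units in total.
        print(is_safe([3], [[1], [4], [2]], [[4], [6], [7]]))  # True: a safe order exists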
  • 9
    Material type: Unknown
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication date: 2013-09-28
    Beschreibung: Security of today's networks heavily rely on network intrusion detection systems (NIDSs). The ability to promptly update the supported rule sets and detect new emerging attacks makes field-programmable gate arrays (FPGAs) a very appealing technology. An important issue is how to scale FPGA-based NIDS implementations to ever faster network links. Whereas a trivial approach is to balance traffic over multiple, but functionally equivalent, hardware blocks, each implementing the whole rule set (several thousands rules), the obvious cons is the linear increase in the resource occupation. In this work, we promote a different, traffic-aware, modular approach in the design of FPGA-based NIDS. Instead of purely splitting traffic across equivalent modules, we classify and group homogeneous traffic, and dispatch it to differently capable hardware blocks, each supporting a (smaller) rule set tailored to the specific traffic category. We implement and validate our approach using the rule set of the well-known Snort NIDS, and we experimentally investigate the emerging trade-offs and advantages, showing resource savings up to 80 percent based on real-world traffic statistics gathered from an operator's backbone.
    Print ISSN: 0018-9340
    Digital ISSN: 1557-9956
    Subject: Computer Science
  • 10
    Material type: Unknown
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication date: 2013-09-28
    Description: Reliability evaluation of interconnection networks is important to the design and maintenance of multiprocessor systems. Extra connectivity determination and faulty networks' structure analysis are two important aspects for the reliability evaluation of interconnection networks. An n-dimensional bijective connection network (in brief, BC network), denoted by X_n, is an n-regular graph with 2^n vertices and n·2^(n-1) edges. The hypercubes, Möbius cubes, crossed cubes, and twisted cubes are some examples of the BC networks. By exploring the boundary problem of the BC networks, we prove that when n ≥ 4 and 0 ≤ h ≤ n−4, the h-extra connectivity of an n-dimensional BC network X_n is κ_h(X_n) = n(h+1) − (1/2)h(h+3). Furthermore, there exists a large connected component and the remaining small components have at most h vertices in total if the total number of faulty vertices is strictly less than its h-extra connectivity. As an application, the results on the h-extra connectivity and structure of faulty networks on hypercubes, Möbius cubes, crossed cubes, and twisted cubes are obtained.
    Print ISSN: 0018-9340
    Digital ISSN: 1557-9956
    Subject: Computer Science
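    The connectivity formula quoted in this abstract is easy to tabulate; the small Python helper below is a direct transcription of it, with the parameter check mirroring the stated range n ≥ 4, 0 ≤ h ≤ n − 4.

        def h_extra_connectivity(n: int, h: int) -> int:
            """kappa_h(X_n) = n(h+1) - h(h+3)/2 for an n-dimensional BC network."""
            if n < 4 or not (0 <= h <= n - 4):
                raise ValueError("formula stated for n >= 4 and 0 <= h <= n - 4")
            return n * (h + 1) - h * (h + 3) // 2

        # Example: the 6-dimensional hypercube (a BC network), h = 0, 1, 2
        print([h_extra_connectivity(6, h) for h in range(3)])   # [6, 10, 13]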
  • 11
    Material type: Unknown
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication date: 2013-09-28
    Beschreibung: New applications based on cloud computing, such as data synchronization for large chain departmental stores and bank transaction records, require very high-speed data transport. Although a number of high-bandwidth networks have been built, existing transport protocols or their variants over such networks cannot fully exploit the network bandwidth. Our experiments show that the fixed-size application level buffer employed in the receiver side is a major cause of this deficiency. A buffer that is either too small or too large impairs the transfer performance. Due to the varied natures of network conditions and of real-time packet processing (i.e., consuming) speed at the receiver, it is important to ensure that the buffer size is dynamically adjusted according to the perceived execution situation during runtime. In this paper, we propose Rada, a dynamic receiving buffer adaptation scheme for high-speed data transfer. Rada employs an exponential moving average aided scheme to quantify the data arrival rate and consumption rate in the buffer. Based on these two rates, we develop a linear aggressive increase conservative decrease scheme to adjust the buffer size dynamically. Moreover, a weighted mean function is employed to make the adjustment adaptive to the available memory in the receiver. Theoretical analysis is provided to demonstrate the rationale and parameter bounds of Rada. The performance of Rada is also theoretically compared with potential alternatives. We implement Rada in a Linux platform and extensively evaluate its performance in a variety of scenarios. Experimental results conform to the theoretical results, and show that Rada outperforms the static buffer scheme in terms of throughput, memory footprint, and fairness.
    Print ISSN: 0018-9340
    Digital ISSN: 1557-9956
    Subject: Computer Science
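    The two ingredients named in this abstract, exponential moving averages of the arrival and consumption rates plus an aggressive-increase/conservative-decrease adjustment, can be illustrated with the toy Python sketch below. The smoothing factor and step sizes are invented for illustration and are not Rada's parameters.

        ALPHA = 0.2                 # assumed EMA smoothing factor
        GROW_STEP = 256 * 1024      # assumed aggressive (large) linear increase
        SHRINK_STEP = 32 * 1024     # assumed conservative (small) decrease

        class BufferTuner:
            def __init__(self, size):
                self.size = size
                self.arrival_ema = 0.0
                self.consume_ema = 0.0

            def update(self, arrived, consumed, mem_free):
                """arrived/consumed: bytes in the last interval; mem_free: bytes."""
                self.arrival_ema = ALPHA * arrived + (1 - ALPHA) * self.arrival_ema
                self.consume_ema = ALPHA * consumed + (1 - ALPHA) * self.consume_ema
                if self.arrival_ema > self.consume_ema:
                    # data piling up: grow aggressively, bounded by free memory
                    self.size += min(GROW_STEP, mem_free)
                else:
                    # consumer keeps up: shrink conservatively
                    self.size = max(self.size - SHRINK_STEP, SHRINK_STEP)
                return self.size

        tuner = BufferTuner(size=64 * 1024)
        print(tuner.update(arrived=8_000_000, consumed=2_000_000, mem_free=1 << 20))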
  • 12
    Material type: Unknown
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication date: 2013-09-28
    Beschreibung: This paper introduces SymPLFIED, a program-level framework that allows specification of arbitrary error detectors and the verification of their efficacy against hardware errors. SymPLFIED comprehensively enumerates all transient hardware errors in registers, memory, and computation (expressed symbolically as value errors) that potentially evade detection and cause program failure. The framework uses symbolic execution to abstract the state of erroneous values in the program and model checking to comprehensively find all errors that evade detection. We demonstrate the use of SymPLFIED on a widely deployed aircraft collision avoidance application, tcas. Our results show that the SymPLFIED framework can be used to uncover hard-to-detect catastrophic cases caused by transient errors in programs that may not be exposed by random fault injection-based validation. Further, the errors exposed by the framework help us formulate a set of error detectors for the application to avoid the catastrophic case and other incorrect outcomes.
    Print ISSN: 0018-9340
    Digital ISSN: 1557-9956
    Subject: Computer Science
  • 13
    Material type: Unknown
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication date: 2013-09-28
    Beschreibung: In delay tolerant networks (DTNs), the lack of continuous connectivity, network partitioning, and long delays make design of network protocols very challenging. Previous DTN research mainly focuses on routing and information propagation. However, with a large number of wireless devices' participation, it becomes crucial regarding how to maintain efficient and dynamic topology of the DTN. In this paper, we study the topology control problem in a predictable DTN, where the time-evolving network topology is known a priori or can be predicted. We first model such time-evolving network as a directed space-time graph that includes both spacial and temporal information. The aim of topology control is to build a sparse structure from the original space-time graph such that 1) the network is still connected over time and supports DTN routing between any two nodes; 2) the total cost of the structure is minimized. We prove that this problem is NP-hard, and propose two greedy-based methods that can significantly reduce the total cost of topology while maintaining the connectivity over time. We also introduce another version of the topology control problem by requiring that the least cost path for any two nodes in this constructed structure is still cost-efficient compared with the one in the original graph. Two greedy-based methods are provided for such a problem. Simulations have been conducted on both random DTN networks and real-world DTN tracing data. Results demonstrate the efficiency of the proposed methods.
    Print ISSN: 0018-9340
    Digital ISSN: 1557-9956
    Subject: Computer Science
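    The directed space-time graph used as the model in this abstract can be built from a predicted contact schedule in a few lines. The sketch below (illustrative costs and naming, not the paper's code) creates one copy of every node per time step, temporal "store" edges between consecutive copies of the same node, and spatial "transmit" edges for each predicted contact.

        def space_time_graph(nodes, contacts_per_step, store_cost=0, link_cost=1):
            """contacts_per_step: list over time of sets of directed pairs (u, v)."""
            edges = {}
            for t, contacts in enumerate(contacts_per_step):
                for u in nodes:
                    # temporal edge: node u keeps the message from time t to t + 1
                    edges[(u, t), (u, t + 1)] = store_cost
                for u, v in contacts:
                    # spatial edge: u can transmit to v during time step t
                    edges[(u, t), (v, t + 1)] = link_cost
            return edges

        g = space_time_graph(["a", "b", "c"], [{("a", "b")}, {("b", "c")}])
        # A route a -> b -> c exists over time even though a and c never meet.
        print((("a", 0), ("b", 1)) in g and (("b", 1), ("c", 2)) in g)   # True

    The topology-control problem in the abstract then amounts to pruning edges of this layered graph while keeping it connected over time.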
  • 14
    Material type: Unknown
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication date: 2013-09-28
    Beschreibung: To achieve secure group communication, one-time session keys need to be shared among group members in a secure and authenticated manner. In this paper, we propose an improved authenticated key transfer protocol based on Shamir's secret sharing. The proposed protocol achieves key confidentiality due to security of Shamir's secret sharing and provides key authentication by broadcasting a single authentication message to all members. Furthermore, the proposed scheme resists against both insider and outsider attacks.
    Print ISSN: 0018-9340
    Digital ISSN: 1557-9956
    Subject: Computer Science
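    As background for this entry, plain Shamir (t, n) secret sharing over a prime field is sketched below in Python. The broadcast authentication message and the insider/outsider protections that the proposed protocol adds on top are not reproduced; the prime and the example key are arbitrary.

        import random

        P = 2**127 - 1   # a Mersenne prime, used here only as a toy field modulus

        def share(secret, t, n):
            """Split `secret` into n shares; any t of them reconstruct it."""
            coeffs = [secret % P] + [random.randrange(P) for _ in range(t - 1)]
            def f(x):
                acc = 0
                for c in reversed(coeffs):       # Horner evaluation mod P
                    acc = (acc * x + c) % P
                return acc
            return [(x, f(x)) for x in range(1, n + 1)]

        def reconstruct(shares):
            """Lagrange interpolation at x = 0 over GF(P)."""
            secret = 0
            for i, (xi, yi) in enumerate(shares):
                num = den = 1
                for j, (xj, _) in enumerate(shares):
                    if i != j:
                        num = num * (-xj) % P
                        den = den * (xi - xj) % P
                secret = (secret + yi * num * pow(den, -1, P)) % P
            return secret

        session_key = random.randrange(P)
        shares = share(session_key, t=3, n=5)
        print(reconstruct(shares[:3]) == session_key)   # True for any 3 shares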
  • 15
    Material type: Unknown
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication date: 2013-09-28
    Beschreibung: Recent cost analysis shows that the server cost still dominates the total cost of high-scale data centers or cloud systems. In this paper, we argue for a new twist on the classical resource provisioning problem: heterogeneous workloads are a fact of life in large-scale data centers, and current resource provisioning solutions do not act upon this heterogeneity. Our contributions are threefold: first, we propose a cooperative resource provisioning solution, and take advantage of differences of heterogeneous workloads so as to decrease their peak resources consumption under competitive conditions; second, for four typical heterogeneous workloads: parallel batch jobs, web servers, search engines, and MapReduce jobs, we build an agile system PhoenixCloud that enables cooperative resource provisioning; and third, we perform a comprehensive evaluation for both real and synthetic workload traces. Our experiments show that our solution could save the server cost aggressively with respect to the noncooperative solutions that are widely used in state-of-the-practice hosting data centers or cloud systems: for example, EC2, which leverages the statistical multiplexing technique, or RightScale, which roughly implements the elastic resource provisioning technique proposed in related state-of-the-art work.
    Print ISSN: 0018-9340
    Digital ISSN: 1557-9956
    Subject: Computer Science
  • 16
    Material type: Unknown
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication date: 2013-09-28
    Beschreibung: A flash translation layer (FTL) provides file systems with transparent access to NAND flash memory. Although many applications running on it require real-time guarantees, it is difficult to provide tight worst case execution time (WCET) bounds with conventional static WCET analysis since an FTL exhibits a large variance in execution time depending on its runtime state. Parametric WCET analysis could be an effective alternative but it is also challenging to formulate a parametric WCET function for an FTL program because traditional FTL architecture does not properly model the runtime availability of flash resources in its code structure. To overcome such a limitation, we propose Petri net-based FTL architecture where a Petri net explicitly specifies dependencies between FTL operations and the runtime resource availability. It comes with an FTL operation sequencer that derives at runtime the shortest sequence of FTL operations for servicing an incoming FTL request under the current resource availability. The sequencer computes the WCET of the request by merely summing the WCETs of only those FTL operations in the sequence. Our experimental results show the effectiveness of our FTL architecture. It allowed for tight WCET estimation that yielded WCETs shorter by a factor of 54 than statically analyzed ones.
    Print ISSN: 0018-9340
    Digital ISSN: 1557-9956
    Subject: Computer Science
  • 17
    Material type: Unknown
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication date: 2013-09-28
    Beschreibung: Process variations in integrated circuits have significant impact on their performance, leakage, and stability. This is particularly evident in large, regular, and dense structures such as DRAMs. DRAMs are built using minimized transistors with presumably uniform speed in an organized array structure. Process variation can introduce latency disparity among different memory arrays. With the proliferation of 3D stacking technology, DRAMs become a favorable choice for stacking on top of a multicore processor as a last level cache for large capacity, high bandwidth, and low power. Hence, variations in bank speed create a unique problem of nonuniform cache accesses in 3D space. In this paper, we investigate cache management techniques for tolerating process variation in a 3D DRAM stacked onto a multicore processor. We modeled the process variation in a four-layer DRAM memory, including cell transistor, capacitor trench, and peripheral circuit, to characterize the latency and retention time variations among different banks. As a result, the notion of fast and slow banks from the core's standpoint is no longer associated with their physical distances with the banks. They are determined by the different bank latencies due to process variation. We develop cache migration schemes that utilize fast banks while limiting the cost due to migration. Our experiments show that there is a great performance benefit in exploiting fast memory banks through migration. On average, a variation-aware management can improve the performance of a workload over the baseline (where one of the slowest bank speed is assumed for all banks) by 16.5 percent. We are also only 0.8 percent away in performance from an ideal memory where no process variation is present.
    Print ISSN: 0018-9340
    Digital ISSN: 1557-9956
    Subject: Computer Science
  • 18
    Material type: Unknown
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication date: 2013-09-28
    Description: We address pairwise and (for the first time) triple key establishment problems in wireless sensor networks (WSN). Several types of combinatorial designs have already been applied in key establishment. A BIBD(v, b, r, k, λ) (or t-(v, b, r, k, λ) design) can be mapped to a sensor network, where v represents the size of the key pool, b represents the maximum number of nodes that the network can support, and k represents the size of the key chain. Any pair (or t-subset) of keys occurs together uniquely in exactly λ nodes; λ = 2 and λ = 3 are used to establish unique pairwise or triple keys. We use several known constructions of designs with λ = 2 to predistribute keys in sensors. We also describe a new construction of a design called strong Steiner trade and use it for pairwise key establishment. To the best of our knowledge, this is the first paper on application of trades to key distribution. Our scheme is highly resilient against node capture attacks (achieved by key refreshing) and is applicable for mobile sensor networks (as key distribution is independent of the connectivity graph), while preserving low storage, computation and communication requirements. We introduce a novel concept of triple key distribution, in which three nodes share common keys, and discuss its application in secure forwarding, detecting malicious nodes and key management in clustered sensor networks. We present a polynomial-based and a combinatorial approach (using trades) for triple key distribution. We also extend our construction to simultaneously provide a pairwise and triple key distribution scheme, and apply it to secure data aggregation.
    Print ISSN: 0018-9340
    Digital ISSN: 1557-9956
    Subject: Computer Science
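    The design-theoretic property the abstract relies on (every pair of keys occurring together in exactly λ nodes) can be checked on a toy instance. The sketch below uses the Fano plane, a 2-(7, 3, 1) design, so λ = 1 rather than the λ = 2 or λ = 3 designs used in the paper; it illustrates the mapping, not the paper's construction.

        from itertools import combinations

        # Blocks of the Fano plane: each block is the key chain of one sensor
        # node; the points 0..6 form the key pool.
        NODES = [
            {0, 1, 2}, {0, 3, 4}, {0, 5, 6},
            {1, 3, 5}, {1, 4, 6}, {2, 3, 6}, {2, 4, 5},
        ]

        # Every pair of keys occurs together in exactly lambda = 1 node ...
        for pair in combinations(range(7), 2):
            assert sum(set(pair) <= node for node in NODES) == 1

        # ... and, dually, any two nodes share exactly one key, so every pair of
        # nodes can derive a pairwise key without further communication.
        for a, b in combinations(NODES, 2):
            assert len(a & b) == 1

        print("Fano-plane key predistribution properties hold")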
  • 19
    Material type: Unknown
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication date: 2013-04-03
    Beschreibung: The present paper proposes a generalization of the square root rule for optimal periodic scheduling. The rule defines a ratio of item occurrences in a schedule, which minimizes the mean serving time. However, the actual number of each item's occurrences must be an integer. Therefore, the square root rule assumes large schedules, in order for the ratio to hold with acceptable precision. The present paper introduces an analysis-derived formula which connects the mean serving time and the size of the schedule. The relation shows that small schedules can also achieve near-optimal serving times. The analysis is validated through comparison with simulation and brute force-derived results. Finally, it is shown that minimizing the size of the schedule is also an efficient way of optimizing the aggregate scheduling cost.
    Print ISSN: 0018-9340
    Digital ISSN: 1557-9956
    Subject: Computer Science
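    For readers unfamiliar with the square root rule referenced here: in its textbook form (not quoted from the paper), serving equal-length items with frequencies proportional to the square roots of their access probabilities minimizes the mean serving time. The Python sketch below only shows the naive rounding of those proportions to integer occurrence counts; the quality of exactly this rounding in small schedules is what the paper analyzes.

        import math

        def occurrence_counts(probs, schedule_size):
            """Round square-root-proportional frequencies to integer counts."""
            weights = [math.sqrt(p) for p in probs]
            total = sum(weights)
            return [max(1, round(schedule_size * w / total)) for w in weights]

        # Three equal-length items with skewed popularity, schedule of 12 slots.
        print(occurrence_counts([0.6, 0.3, 0.1], schedule_size=12))   # [6, 4, 2]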
  • 20
    Material type: Unknown
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication date: 2013-04-03
    Description: Numerous works have addressed efficient parallel GF(2^m) multiplication based on polynomial basis or some of its variants. For those field degrees where neither irreducible trinomials nor Equally Spaced Polynomials (ESPs) exist, the best area/time performance has been achieved for special-type irreducible pentanomials, which however do not exist for all degrees. In other words, no multiplier architecture has been proposed so far achieving the best performance and, at the same time, being general enough to support any field degrees. In this paper, we propose a new representation, based on what we called Generalized Polynomial Bases (GPBs), covering polynomial bases and the so-called Shifted Polynomial Bases (SPBs) as special cases. In order to study the new representation, we introduce a novel formulation for polynomial basis and its variants, which is able to express concisely all implementation aspects of interest, i.e., gate count, subexpression sharing, and time delay. The methodology enabled by the new formulation is completely general and repetitive in its application, allowing the development of an ad-hoc software tool to derive proofs for area complexity and time delays automatically. As the central contribution of this paper, we introduce some new types of irreducible pentanomials and an associated GPB. Based on the above formulation, we prove that carefully chosen GPBs yield multiplier architectures matching, or even outperforming, the best special-type pentanomials from both the area and time point of view. Most importantly, the proposed GPB architectures require pentanomials existing for all degrees of practical interest. A list of suitable irreducible pentanomials for all degrees less than 1,000 is given in the appendix (Fig. 5 and Tables 4-11 are provided in a separate file containing the body of Appendix, which can be found on the Computer Society Digital Library at http://doi.ieeecomputersociety.org/10.1109/TC.2012.63).
    Print ISSN: 0018-9340
    Digital ISSN: 1557-9956
    Subject: Computer Science
  • 21
    Material type: Unknown
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication date: 2013-04-03
    Beschreibung: Low complexity solutions to provide deterministic quality over packet switched networks while achieving high resource utilization have been an open research issue for many years. Service differentiation combined with resource overprovisioning has been considered an acceptable compromise and widely deployed given that the amount of traffic requiring quality guarantees has been limited. This approach is not viable, though, as new bandwidth hungry applications, such as video on demand, telepresence, and virtual reality, populate networks invalidating the rationale that made it acceptable so far. Time-driven priority represents a potentially interesting solution. However, the fact that the network operation is based on a time reference shared by all nodes raises concerns on the complexity of the nodes, from the point of view of both their hardware and software architecture. This work analyzes the implications that the timing requirements of time-driven priority have on network nodes and shows how proper operation can be ensured even when system components introduce timing uncertainties. Experimental results on a time-driven priority router implementation based on a personal computer both validate the analysis and demonstrate the feasibility of the technology even on an architecture that is not designed for operating under timing constraints.
    Print ISSN: 0018-9340
    Digital ISSN: 1557-9956
    Subject: Computer Science
  • 22
    Material type: Unknown
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication date: 2013-04-03
    Beschreibung: In recent years, we have experienced a wave of DDoS attacks threatening the welfare of the internet. These are launched by malicious users whose only incentive is to degrade the performance of other, innocent, users. The traditional systems turn out to be quite vulnerable to these attacks. The objective of this work is to take a first step to close this fundamental gap, aiming at laying a foundation that can be used in future computer/network designs taking into account the malicious users. Our approach is based on proposing a metric that evaluates the vulnerability of a system. We then use our vulnerability metric to evaluate a data structure which is commonly used in network mechanisms—the Hash table data structure. We show that Closed Hash is much more vulnerable to DDoS attacks than Open Hash, even though the two systems are considered to be equivalent by traditional performance evaluation. We also apply the metric to queuing mechanisms common to computer and communications systems. Furthermore, we apply it to the practical case of a hash table whose requests are controlled by a queue, showing that even after the attack has ended, the regular users still suffer from performance degradation or even a total denial of service.
    Print ISSN: 0018-9340
    Digital ISSN: 1557-9956
    Subject: Computer Science
  • 23
    Material type: Unknown
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication date: 2013-04-03
    Beschreibung: In Network Intrusion Detection Systems (NIDSs), string pattern matching demands exceptionally high performance to match the content of network traffic against a predefined database (or dictionary) of malicious patterns. Much work has been done in this field; however, most of the prior work results in low memory efficiency (defined as the ratio of the amount of the required storage in bytes and the size of the dictionary in number of characters). Due to such inefficiency, state-of-the-art designs cannot support large dictionaries without using high-latency external DRAM. We propose an algorithm called "leaf-attaching" to preprocess a given dictionary without increasing the number of patterns. The resulting set of postprocessed patterns can be searched using any tree-search data structure. We also present a scalable, high-throughput, Memory-efficient Architecture for large-scale String Matching (MASM) based on a pipelined binary search tree. The proposed algorithm and architecture achieve a memory efficiency of 0.56 (for the Rogets dictionary) and 1.32 (for the Snort dictionary). As a result, our design scales well to support larger dictionaries. Implementations on 45 nm ASIC and a state-of-the-art FPGA device (for latest Rogets and Snort dictionaries) show that our architecture achieves 24 and 3.2 Gbps, respectively. The MASM module can simply be duplicated to accept multiple characters per cycle, leading to scalable throughput with respect to the number of characters processed in each cycle. Dictionary update involves simply rewriting the content of the memory, which can be done quickly without reconfiguring the chip.
    Print ISSN: 0018-9340
    Digital ISSN: 1557-9956
    Subject: Computer Science
  • 24
    Material type: Unknown
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication date: 2013-04-03
    Beschreibung: In this paper, an analytic model is proposed for the performance evaluation of vehicular safety related services in the dedicated short range communications (DSRC) system on highways. The generation and service of safety messages in each vehicle is modeled by a generalized M/G/1 queue. The overall model is a set of interacting M/G/1 queues, one queue for each vehicle. The interaction is that the server is shared as it is the contention medium. To make the model scalable, we use semi-Markov process (SMP) model to capture the shared server's behavior from one tagged vehicle's perspective, where the medium contention and back off behavior for this vehicle and influences from other vehicles are considered. Furthermore, this SMP interacts with the tagged vehicle's own M/G/1 queue through fixed-point iteration. The proof for the existence, uniqueness and convergence of the fixed point is provided. Based on the fixed-point solution, performance indices including mean transmission delay, packet delivery ratio (PDR), and packet reception ratio (PRR) are derived. Analytic-numeric results are verified through extensive simulations under various network parameters. Compared with the existing models, the proposed SMP model facilitates the impact analysis of hidden terminal problem on the PDR and PRR computation in a more precise manner.
    Print ISSN: 0018-9340
    Digital ISSN: 1557-9956
    Subject: Computer Science
  • 25
    Material type: Unknown
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication date: 2013-04-03
    Beschreibung: We present a comprehensive, self-contained, and mechanically verified proof of correctness of a maximally redundant SRT design for floating-point division and square root extraction, supported by verified procedures that 1) test the admissibility of a proposed digit selection table, 2) determine the minimal dimensions of an admissible table for a given arbitrary radix, and 3) generate these tables. For square root extraction, we also provide a verified procedure for generating an initial approximation that meets the accuracy requirement of the algorithm and ensures that the digit selection index derived from successive partial roots remains static throughout the computation. A radix-8 instantiation of these algorithms has been implemented in the floating-point unit of the AMD processor code-named Steamroller. To ensure their correctness, all of our results and procedures have been formalized and mechanically checked by the ACL2 prover. We present evidence of the value of this approach by comparing it to that of a more conventional published paper that reports similar results, which are shown to be fatally flawed.
    Print ISSN: 0018-9340
    Digital ISSN: 1557-9956
    Subject: Computer Science
  • 26
    Material type: Unknown
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication date: 2013-04-03
    Beschreibung: Multicore chips are currently dominating the microprocessor market as designs that improve performance and sustain power consumption. However, complex core features must be still considered to provide good performance for existing sequential applications. An effective approach to reduce core complexity without dramatically sacrificing performance is to distribute critical processor structures by using clustered microarchitectures. In these designs, communication latency among clusters is a critical performance bottleneck, and a good steering algorithm is required to reduce intercluster communication. In this paper, we propose a new energy-efficient microarchitectural approach that reduces intercluster communication by detecting and generating independent chains of instructions, referred to as subtraces, from the execution of sequential programs. The devised mechanism has been modeled on an x86-based trace-cache processor, where subtraces are built in the fill unit, stored in a trace cache, and individually steered to different clusters. Experimental results show that the proposal reaches performance speedups around 7 and 15 percent for point-to-point and bus-based interconnects, respectively, while achieving energy savings of up to 12 percent.
    Print ISSN: 0018-9340
    Digital ISSN: 1557-9956
    Subject: Computer Science
  • 27
    Material type: Unknown
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication date: 2013-04-03
    Description: The decimal multiplication is one of the most important decimal arithmetic operations which has a growing demand in the area of commercial, financial, and scientific computing. In this paper, we propose a parallel decimal multiplication algorithm with three components, which are a partial product generation, a partial product reduction, and a final digit-set conversion. First, a redundant number system is applied to recode not only the multiplier, but also multiples of the multiplicand in signed-digit (SD) numbers. Furthermore, we present a multioperand SD addition algorithm to reduce the partial product array. Finally, a digit-set conversion algorithm with a hybrid prefix network to decrease the number of the logic gates on the critical path is discussed. An analysis of the timing delay and an HDL model synthesized under 90 nm technology show that by considering the tradeoff of designs among three components, the overall delay of the proposed 16 × 16-digit multiplier takes about 11 percent less timing delay with 2 percent less area compared to the current fastest design.
    Print ISSN: 0018-9340
    Digital ISSN: 1557-9956
    Subject: Computer Science
  • 28
    Material type: Unknown
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication date: 2013-02-02
    Beschreibung: Longest Prefix Matching (LPM), Policy Filtering (PF), and Content Filtering (CF) are three important tasks for Internet nowadays. It is both technologically and economically important to develop integrated solutions to the effective execution of the three tasks. To this end, in this paper, we propose a distributed Ternary Content Addressable Memory (TCAM) coprocessor architecture. The integrated solution exploits the complementary lookup load and storage load requirements of the three tasks to balance the lookup load and storage load among the TCAMs. A prefix filtering-based CF algorithm is designed to reduce the lookup load and a novel cache system is developed to dynamically handle the lookups from overloaded TCAMs. Simulations based on real-world traffic traces show that the proposed solution can perform all three tasks given a 10 Gbps line rate using only the resources required to perform just the CF task given a 10 Gbps line rate.
    Print ISSN: 0018-9340
    Digital ISSN: 1557-9956
    Subject: Computer Science
  • 29
    Material type: Unknown
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication date: 2013-02-02
    Beschreibung: Simple power attack (SPA) is a type of side-channel attack (SCA). In the literature, many SPA-resistant scalar multiplication algorithms have been proposed, but most are inefficient and not interoperable with other coding methods. To prevent SPA, Chevallier-Mames et al. proposed a technique called side-channel atomicity for pure binary number systems. Using their method, extra costs for preventing SPA can be limited. Even though many researchers have extended this technique to other number systems, their algorithms are for specific cases and few provide implementation results. In this paper, we generalize the atomicity technique to protect nearly all existing fast coding methods/number systems. Our general framework provides security and flexibility while its efficiency is coupled to that of the coding methods. Moreover, we utilize our framework to protect the known fastest scalar multiplications by exploring application on the GLV method for GLS curves. Proof of concept programs are written in the C language along with assembly for fast field operations and run on AMD Athlon X2 245-based hardware.
    Print ISSN: 0018-9340
    Digital ISSN: 1557-9956
    Subject: Computer Science
  • 30
    Material type: Unknown
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication date: 2013-02-02
    Beschreibung: System-level diagnosis is a crucial subject for maintaining the reliability of multiprocessor interconnected systems. Consider a system composed of N independent processors, each of which tests a subset of the others. Under the PMC diagnosis model, Dahbura and Masson proposed an O(N^{2.5}) algorithm to identify the set of faulty processors in a t-diagnosable system, in which at most t processors are permanently faulty. In this paper, we establish some sufficient conditions so that a t-regular system can be conditionally (2t-1)-diagnosable, provided every fault-free processor has at least one fault-free neighbor. Because any t-regular system is no more than t-diagnosable, the approached diagnostic capability is nearly double the classical one-step diagnosability. Furthermore, a correct and complete method is given which exploits these conditions and the presented branch-of-tree architecture to determine the fault status of any single processor. The proposed method has time complexity O(t^2), and thus can diagnose the whole system in time O(t^2 N). In short, not only could the diagnostic capability be proved theoretically, but also it is feasible from an algorithmic perspective.
    Print ISSN: 0018-9340
    Digital ISSN: 1557-9956
    Subject: Computer Science
  • 31
    Material type: Unknown
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication date: 2013-02-02
    Beschreibung: Real-time (RT) communication support is a critical requirement for many complex embedded applications which are currently targeted to Network-on-chip (NoC) platforms. In this paper, we present novel methods to efficiently calculate worst case bandwidth and latency bounds for RT traffic streams on wormhole-switched NoCs with arbitrary topology. The proposed methods apply to best-effort NoC architectures, with no extra hardware dedicated to RT traffic support. By applying our methods to several realistic NoC designs, we show substantial improvements (more than 30 percent in bandwidth and 50 percent in latency, on average) in bound tightness with respect to existing approaches.
    Print ISSN: 0018-9340
    Digital ISSN: 1557-9956
    Subject: Computer Science
  • 32
    Material type: Unknown
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication date: 2013-02-02
    Beschreibung: Due to the increasing demand of an extra-low-power system, a great amount of research effort has been spent in the past to develop an effective and economic subthreshold SRAM design. However, the test methods regarding those newly developed subthreshold SRAM designs have not yet been fully discussed. In this paper, we first categorize the subthreshold SRAM designs into three types, study the faulty behavior of open defects and address decoders faults on each type of designs, and then identify the faults which may not be covered by a traditional SRAM test method. We will also discuss the impact of open defects and threshold-voltage mismatch on sense amplifiers under subthreshold operations. A discussion about the temperature at test is also provided.
    Print ISSN: 0018-9340
    Digital ISSN: 1557-9956
    Subject: Computer Science
  • 33
    Material type: Unknown
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication date: 2013-02-02
    Beschreibung: By exploring different granularities of data-level and task-level parallelism, we map 16 implementations of an Advanced Encryption Standard (AES) cipher with both online and offline key expansion on a fine-grained many-core system. The smallest design utilizes only six cores for offline key expansion and eight cores for online key expansion, while the largest requires 107 and 137 cores, respectively. In comparison with published AES cipher implementations on general purpose processors, our design has 3.5-15.6 times higher throughput per unit of chip area and 8.2-18.1 times higher energy efficiency. Moreover, the design shows 2.0 times higher throughput than the TI DSP C6201, and 3.3 times higher throughput per unit of chip area and 2.9 times higher energy efficiency than the GeForce 8800 GTX.
    Print ISSN: 0018-9340
    Digital ISSN: 1557-9956
    Subject: Computer Science
  • 34
    Material type: Unknown
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication date: 2013-02-02
    Beschreibung: Adaptive computing systems rely on accurate predictions of application behavior to understand and respond to the dynamically varying characteristics. In this study, we present a Statistical Metric Model (SMM) that is system- and metric-independent for predicting application behavior. SMM is a probability distribution over application patterns of varying length and it models how likely a specific behavior occurs. Maximum Likelihood Estimation (MLE) criterion is used to estimate the parameters of SMM. The parameters are further refined with a smoothing method to improve prediction robustness. We also propose an extension to SMM (i.e., SMM-Interp) to handle sudden short-term changes in application behavior. SMM learns the application patterns during runtime, and at the same time predicts the upcoming application phases based on what it has learned up to that point. We demonstrate several key features of SMM: 1) adaptation, 2) variable length sequence modeling, and 3) long-term memory. An extensive and rigorous series of prediction experiments show the superior performance of the SMM predictor over existing predictors on a wide range of benchmarks. For some of the benchmarks, SMM reduces the prediction error rate by 10X and 3X, compared to last value and table-based prediction approaches, respectively. SMM's improved prediction accuracy results in superior power-performance tradeoffs when it is applied to an adaptive dynamic power management scheme.
    Print ISSN: 0018-9340
    Digital ISSN: 1557-9956
    Subject: Computer Science
  • 35
    Material type: Unknown
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication date: 2013-02-02
    Beschreibung: As the number of transistors that are integrated onto a silicon die continues to increase, the compute power is becoming a commodity. This has enabled a whole host of new applications that rely on high-throughput computations. Recently, the need for faster and cost-effective applications in form-factor constrained environments has driven an interest in on-chip acceleration of algorithms based on Monte Carlo simulations. Though Field Programmable Gate Arrays (FPGAs), with hundreds of on-chip arithmetic units, show significant promise for accelerating these embarrassingly parallel simulations, a challenge exists in sharing access to simulation data among many concurrent experiments. This paper presents a compute architecture for accelerating Monte Carlo simulations based on the Network-on-Chip (NOC) paradigm for on-chip communication. We demonstrate through the complete implementation of a Monte Carlo-based image reconstruction algorithm for Single-Photon Emission Computed Tomography (SPECT) imaging that this complex problem can be accelerated by two orders of magnitude on even a modestly sized FPGA over a 2 GHz Intel Core 2 Duo Processor. The architecture and the methodology that we present in this paper is modular and hence it is scalable to problem instances of different sizes, with application to other domains that rely on Monte Carlo simulations.
    Print ISSN: 0018-9340
    Digital ISSN: 1557-9956
    Subject: Computer Science
  • 36
    Material type: Unknown
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication date: 2013-02-02
    Description: In this paper, we study the problem of optimal oblivious routing for 1D and 2D torus networks. We introduce a new closed-form oblivious routing algorithm called W2TURN that is worst-case throughput optimal for 2D torus networks. W2TURN is based on a weighted random selection of paths that contain at most two turns. Restricting the maximum number of turns in routing paths to just two enables a simple deadlock-free implementation of W2TURN. In terms of average hop count, W2TURN outperforms the best previously known closed-form worst-case throughput optimal routing algorithm called IVAL. When the network radix is odd, W2TURN achieves the minimum average hop count that can be achieved with 2-turn paths while remaining worst-case throughput optimal. When the network radix is even, W2TURN comes very close to achieving the minimum average hop count while remaining worst-case throughput optimal, within just 0.72 percent on a 12 × 12 torus. We also describe another routing algorithm based on weighted random selection of paths with at most two turns called I2TURN and show that I2TURN is equivalent to IVAL. However, I2TURN eliminates the need for loop removal at runtime and provides a closed-form analytical expression for evaluating the average hop count. The latter enables us to demonstrate analytically that W2TURN strictly outperforms IVAL (and I2TURN) in average hop count. Finally, we present a new optimal weighted random routing algorithm for rings called Weighted Random Direction (WRD). WRD provides a closed-form expression for the optimal distribution of traffic along the minimal and nonminimal directions in a ring topology to achieve minimum average hop count while guaranteeing optimal worst-case throughput. Based on our evaluations, in addition to being worst-case throughput optimal, W2TURN and WRD also perform well in the average case, and outperform the best previously known worst-case throughput optimal routing algorithms with closed-form descriptions in latency and throughput over a wide range of traffic patterns.
    Print ISSN: 0018-9340
    Digitale ISSN: 1557-9956
    Thema: Informatik
    Standort Signatur Erwartet Verfügbarkeit
    BibTip Andere fanden auch interessant ...
  • 37
    facet.materialart.
    Unbekannt
    Institute of Electrical and Electronics Engineers (IEEE)
    Publikationsdatum: 2013-02-02
    Beschreibung: The provision of location-based services with high positional accuracy requires the use of Time of Arrival (TOA)-based techniques. However, existing TOA-based WLAN location service schemes are inefficient due to the individual query-and-response ranging method they employ. We present a highly efficient WLAN location service architecture that includes a modification to the Transmit Opportunity (TXOP) technique in the IEEE 802.11e standard. Our Location Service with TXOP (LSOP) scheme achieves high efficiency by minimizing the number of TOA transmissions and eliminating the contention overhead for TOA messages. The adaptation of the TXOP technique also improves location accuracy by protecting TOA messages from collision and by grouping the TOA messages into one compact burst. Our analysis shows that the LSOP scheme achieves the highest location update rate compared to previous schemes. Our simulation results show that the LSOP scheme has minimal impact on data traffic and achieves higher accuracy than the previous schemes. Experimental results demonstrate the degradation in localization performance caused by packet collisions. These results validate that our LSOP scheme, which implements contention-free broadcast of TOA messages with a modified TXOP, provides the best combination of high location update rate, low network load, and high location accuracy compared to other schemes.
    Print ISSN: 0018-9340
    Digitale ISSN: 1557-9956
    Thema: Informatik
    Standort Signatur Erwartet Verfügbarkeit
    BibTip Andere fanden auch interessant ...
  • 38
    facet.materialart.
    Unbekannt
    Institute of Electrical and Electronics Engineers (IEEE)
    Publikationsdatum: 2013-02-02
    Beschreibung: A key aspect in the design of efficient multiprocessor systems is the cache coherence protocol. Although directory-based protocols constitute the most scalable approach, the limited size of directory caches together with the growing size of systems may cause frequent evictions and, consequently, the invalidation of cached blocks, which jeopardizes system performance. Directory caches keep track of every memory block stored in processor caches in order to provide coherent access to the shared memory. However, a significant fraction of the cached memory blocks do not require coherence maintenance (even in parallel applications) because they are either accessed by just one processor or never modified. In this paper, we propose to deactivate the coherence protocol for those blocks that do not require coherence. This deactivation means that directory caches do not have to keep track of noncoherent blocks, which reduces directory cache occupancy and increases their effectiveness. Since the detection of noncoherent blocks is carried out by the operating system, our proposal only requires minor hardware modifications. Simulation results show that, thanks to our proposal, directory caches can avoid tracking about 66 percent (on average) of the blocks accessed by a wide range of applications, thereby improving the efficiency of directory caches. This contributes either to shortening the runtime of parallel applications by 15 percent (on average) while keeping the directory cache size unchanged, or to maintaining performance while using directory caches 16 times smaller.
    Print ISSN: 0018-9340
    Digitale ISSN: 1557-9956
    Thema: Informatik
    Standort Signatur Erwartet Verfügbarkeit
    BibTip Andere fanden auch interessant ...
  • 39
    facet.materialart.
    Unbekannt
    Institute of Electrical and Electronics Engineers (IEEE)
    Publikationsdatum: 2013-02-02
    Beschreibung: In this paper, we propose a distributed routing algorithm for vertically partially connected regular 2D topologies of different shapes and sizes (e.g., 2D mesh, torus, ring). The topologies targeted by this algorithm are of practical interest in the 3D integration of heterogeneous dies using Through-Silicon Vias (TSVs). Indeed, TSV-based 3D integration makes it possible to envision the stacking of dies with different functions and technologies, using a 3D-NoC as the interconnect backbone. Intrinsically, 3D topologies have better performance, but yield and active area (and thus cost) are functions of the number of TSVs; therefore, designs tend to use only a subset of the available TSVs between two dies. Defining blockage-free, low-implementation-cost distributed deterministic routing on this kind of topology is thus of both theoretical and practical interest. We formally prove that, independently of the shape and dimensions of the planar topologies and of the number and placement of the TSVs, the proposed routing algorithm using two virtual channels in the plane is deadlock and livelock free. We also show experimentally that the performance of this algorithm remains acceptable when the number of vertical connections decreases.
    Print ISSN: 0018-9340
    Digitale ISSN: 1557-9956
    Thema: Informatik
    Standort Signatur Erwartet Verfügbarkeit
    BibTip Andere fanden auch interessant ...
  • 40
    facet.materialart.
    Unbekannt
    Institute of Electrical and Electronics Engineers (IEEE)
    Publikationsdatum: 2013-02-02
    Beschreibung: Performance degradation of integrated circuits due to aging effects, such as Negative Bias Temperature Instability (NBTI), is becoming a great concern for current and future CMOS technology. In this paper, we propose two monitoring and masking approaches that detect late transitions due to NBTI degradation in the combinational part of critical data paths and guarantee the correctness of the provided output data by adapting the clock frequency. Compared to recently proposed alternative solutions, one of our approaches (denoted as the Low Area and Power (LAP) approach) requires lower area overhead and lower, or comparable, power consumption while exhibiting the same impact on system performance, whereas the other approach (denoted as the High Performance (HP) approach) reduces the impact on system performance at the cost of some increase in area and power consumption.
    Print ISSN: 0018-9340
    Digitale ISSN: 1557-9956
    Thema: Informatik
    Standort Signatur Erwartet Verfügbarkeit
    BibTip Andere fanden auch interessant ...
  • 41
    facet.materialart.
    Unbekannt
    Institute of Electrical and Electronics Engineers (IEEE)
    Publikationsdatum: 2013-02-02
    Beschreibung: Existing mechanisms for handover authentication mainly focus on designing a secure authentication module; little attention has been paid to protecting users' privacy when they are authenticated by access points for data access. Furthermore, most existing approaches do not support user revocation. In this paper, we present a secure and efficient authentication protocol named Handauth. Like other mechanisms in this field, Handauth provides user authentication and session key establishment. However, compared to other well-known approaches, Handauth not only enjoys both computation and communication efficiency, but also achieves strong user anonymity and untraceability, forward-secure user revocation, conditional privacy preservation, AAA server anonymity, access service expiration management, access point authentication, easily scheduled revocation, dynamic user revocation, and attack resistance. Experimental results show that the proposed approach is feasible for real applications.
    Print ISSN: 0018-9340
    Digitale ISSN: 1557-9956
    Thema: Informatik
    Standort Signatur Erwartet Verfügbarkeit
    BibTip Andere fanden auch interessant ...
  • 42
    facet.materialart.
    Unbekannt
    Institute of Electrical and Electronics Engineers (IEEE)
    Publikationsdatum: 2013-02-02
    Beschreibung: Quantum-dot Cellular Automata (QCA) technology is a promising potential alternative to CMOS technology. To explore the characteristics of QCA and suitable design methodologies, digital circuit design approaches have been investigated. Due to the inherent wire delay in QCA, pipelined architectures appear to be a particularly suitable design technique. Also, because of the pipelined nature of QCA technology, it is not well suited to complicated control system design. Systolic arrays take advantage of pipelining, parallelism, and simple local control. Therefore, an investigation into these architectures in semiconductor QCA technology is provided in this paper. Two case studies (a matrix multiplier and a Galois field multiplier) are designed and analyzed based on both multilayer and coplanar crossings. The performance of these two types of interconnection is compared, and it is found that even though coplanar crossings are currently more practical, they tend to occupy a larger design area and incur slightly more delay. A general semiconductor QCA systolic array design methodology is also proposed. It is found that by applying a systolic array structure in QCA design, significant benefits can be achieved, particularly with large systolic arrays, even more so than when applied in CMOS-based technology.
    Print ISSN: 0018-9340
    Digitale ISSN: 1557-9956
    Thema: Informatik
    Standort Signatur Erwartet Verfügbarkeit
    BibTip Andere fanden auch interessant ...
  • 43
    facet.materialart.
    Unbekannt
    Institute of Electrical and Electronics Engineers (IEEE)
    Publikationsdatum: 2013-02-02
    Print ISSN: 0018-9340
    Digitale ISSN: 1557-9956
    Thema: Informatik
    Standort Signatur Erwartet Verfügbarkeit
    BibTip Andere fanden auch interessant ...
  • 44
    facet.materialart.
    Unbekannt
    Institute of Electrical and Electronics Engineers (IEEE)
    Publikationsdatum: 2013-02-02
    Beschreibung: In a multiprocessor system, a limited number of resources need to be uniformly distributed so that all processor nodes can have equal access to these resources. This is referred to as the resource placement problem. In a perfect t-placement, each nonresource node is at a distance of t or less from exactly one resource node. Here, we first find all perfect t-placements in the infinite square and triangular grids. That information is then used to show that translates of earlier sets are the only perfect t-placements in Gaussian and EJ interconnection networks.
    Print ISSN: 0018-9340
    Digitale ISSN: 1557-9956
    Thema: Informatik
    Standort Signatur Erwartet Verfügbarkeit
    BibTip Andere fanden auch interessant ...
  • 45
    facet.materialart.
    Unbekannt
    Institute of Electrical and Electronics Engineers (IEEE)
    Publikationsdatum: 2013-02-02
    Beschreibung: Continuously monitoring link performance is important to network diagnosis. In this paper, we address the problem of minimizing the probing cost and achieving identifiability in probe-based network link monitoring. Given a set of links to monitor, our objective is to select the minimum number of probing paths that can uniquely determine all identifiable links and cover all unidentifiable links. We propose an algorithm based on a linear system model to find out all irreducible sets of probing paths that can uniquely determine an identifiable link, and we extend the bipartite model to reflect the relationship between a set of probing paths and an identifiable link. Since our optimization problem is NP-hard, we propose a heuristic-based algorithm to greedily select probing paths. Our method eliminates two types of redundant probing paths, i.e., those that can be replaced by others and those without any contribution to achieving identifiability. Simulations based on real network topologies show that our approach can achieve identifiability with very low probing cost. Compared with prior work, our method is more general and has better performance.
    Print ISSN: 0018-9340
    Digitale ISSN: 1557-9956
    Thema: Informatik
    Standort Signatur Erwartet Verfügbarkeit
    BibTip Andere fanden auch interessant ...
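    The entry above greedily selects probing paths so that the monitored links are covered. The sketch below is a generic greedy, set-cover-style selection under assumed inputs (a dictionary mapping candidate paths to the links they traverse); it is not the paper's identifiability-aware algorithm.

        def greedy_path_selection(candidate_paths, links_to_monitor):
            """candidate_paths: dict path_id -> set of link ids the path traverses."""
            selected, uncovered = [], set(links_to_monitor)
            while uncovered:
                # pick the path covering the most still-uncovered links
                best = max(candidate_paths, key=lambda p: len(candidate_paths[p] & uncovered))
                gain = candidate_paths[best] & uncovered
                if not gain:
                    break                      # remaining links cannot be covered by any path
                selected.append(best)
                uncovered -= gain
            return selected, uncovered

        paths = {"p1": {"l1", "l2"}, "p2": {"l2", "l3"}, "p3": {"l3"}}
        print(greedy_path_selection(paths, {"l1", "l2", "l3"}))   # (['p1', 'p2'], set())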
  • 46
    facet.materialart.
    Unbekannt
    Institute of Electrical and Electronics Engineers (IEEE)
    Publikationsdatum: 2013-02-02
    Beschreibung: This paper presents a high-performance and biophysically accurate neuroprocessor architecture based on floating-point arithmetic and compartmental modeling. It aims to overcome the limitations of traditional hardware neuron models that simplify the required arithmetic using fixed-point models, which can result in an arbitrary loss of precision due to rounding errors and data truncation. On the other hand, a neuroprocessor based on a floating-point bio-inspired model, such as the one presented in this work, is able to capture additional cell properties and accurately mimic cellular behaviors required in many neuroscience experiments. The architecture is prototyped in reconfigurable logic, obtaining a flexible and adaptable cell and network structure together with real-time performance by using the available floating-point hardware resources in parallel. The paper also demonstrates model scalability by combining the basic processor components that describe the soma, dendrite, and synapse of organic cells to form more complex neuron structures.
    Print ISSN: 0018-9340
    Digitale ISSN: 1557-9956
    Thema: Informatik
    Standort Signatur Erwartet Verfügbarkeit
    BibTip Andere fanden auch interessant ...
  • 47
    facet.materialart.
    Unbekannt
    Institute of Electrical and Electronics Engineers (IEEE)
    Publikationsdatum: 2013-02-02
    Print ISSN: 0018-9340
    Digitale ISSN: 1557-9956
    Thema: Informatik
    Standort Signatur Erwartet Verfügbarkeit
    BibTip Andere fanden auch interessant ...
  • 48
    facet.materialart.
    Unbekannt
    Institute of Electrical and Electronics Engineers (IEEE)
    Publikationsdatum: 2013-12-07
    Beschreibung: Cloud computing is emerging as the platform for scalable and efficient services. To meet the needs of handling massive data and reducing data migration, the computation infrastructure requires efficient data placement and proper management of cached data. In this paper, we propose an efficient and cost-effective multilevel caching scheme, called MERCURY, as the computation infrastructure of the cloud. The idea behind MERCURY is to explore and exploit data similarity and support efficient data placement. To accurately and efficiently capture the data similarity, we leverage low-complexity locality-sensitive hashing (LSH). In our design, in addition to the problem of space inefficiency, we identify that a conventional LSH scheme also suffers from the problem of homogeneous data placement. To address these two problems, we design a novel multicore-enabled locality-sensitive hashing (MC-LSH) that accurately captures the differentiated similarity across data. The similarity-aware MERCURY, hence, partitions data into the L1 cache, L2 cache, and main memory based on their distinct localities, which helps optimize cache utilization and minimize pollution in the last-level cache. Besides extensive evaluation through simulations, we also implemented MERCURY in a system. Experimental results based on real-world applications and data sets demonstrate the efficiency and efficacy of our proposed schemes.
    Print ISSN: 0018-9340
    Digitale ISSN: 1557-9956
    Thema: Informatik
    Standort Signatur Erwartet Verfügbarkeit
    BibTip Andere fanden auch interessant ...
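    The entry above relies on locality-sensitive hashing to group similar data. The sketch below shows standard random-hyperplane (cosine) LSH, a common baseline; it is not the paper's multicore-enabled MC-LSH, and the dimensionality and signature width are illustrative assumptions.

        import numpy as np

        def lsh_signature(vectors, n_bits=16, dim=64, seed=0):
            """Random-hyperplane LSH: similar vectors tend to share signature bits."""
            rng = np.random.default_rng(seed)
            planes = rng.standard_normal((n_bits, dim))     # one random hyperplane per bit
            bits = (vectors @ planes.T) >= 0                # sign of each projection
            return np.packbits(bits, axis=1)                # compact per-vector signature

        data = np.random.default_rng(1).standard_normal((4, 64))
        data[1] = data[0] + 0.01                            # near-duplicate of row 0
        sigs = lsh_signature(data)
        print((sigs[0] == sigs[1]).all())                   # near-duplicates usually collide: True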
  • 49
    facet.materialart.
    Unbekannt
    Institute of Electrical and Electronics Engineers (IEEE)
    Publikationsdatum: 2013-12-07
    Print ISSN: 0018-9340
    Digitale ISSN: 1557-9956
    Thema: Informatik
    Standort Signatur Erwartet Verfügbarkeit
    BibTip Andere fanden auch interessant ...
  • 50
    facet.materialart.
    Unbekannt
    Institute of Electrical and Electronics Engineers (IEEE)
    Publikationsdatum: 2013-12-07
    Beschreibung: To provide fault tolerance for cloud storage, recent studies propose to stripe data across multiple cloud vendors. However, if a cloud suffers from a permanent failure and loses all its data, we need to repair the lost data with the help of the other surviving clouds to preserve data redundancy. We present a proxy-based storage system for fault-tolerant multiple-cloud storage called NCCloud, which achieves cost-effective repair for a permanent single-cloud failure. NCCloud is built on top of a network-coding-based storage scheme called the functional minimum-storage regenerating (FMSR) codes, which maintain the same fault tolerance and data redundancy as in traditional erasure codes (e.g., RAID-6), but use less repair traffic and, hence, incur less monetary cost due to data transfer. One key design feature of our FMSR codes is that we relax the encoding requirement of storage nodes during repair, while preserving the benefits of network coding in repair. We implement a proof-of-concept prototype of NCCloud and deploy it atop both local and commercial clouds. We validate that FMSR codes provide significant monetary cost savings in repair over RAID-6 codes, while having comparable response time performance in normal cloud storage operations such as upload/download.
    Print ISSN: 0018-9340
    Digitale ISSN: 1557-9956
    Thema: Informatik
    Standort Signatur Erwartet Verfügbarkeit
    BibTip Andere fanden auch interessant ...
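    The entry above stripes data with regenerating codes across clouds. As a much simpler stand-in (plain XOR parity in the spirit of RAID, not the paper's FMSR codes), the sketch below shows how a lost share can be rebuilt from the surviving ones; the split into three data shares plus one parity share is an illustrative assumption.

        def split_with_parity(data: bytes, k: int = 3):
            """Split data into k equal shares plus one XOR parity share (toy example)."""
            if len(data) % k:
                data += b"\x00" * (k - len(data) % k)        # pad to a multiple of k
            size = len(data) // k
            shares = [bytearray(data[i * size:(i + 1) * size]) for i in range(k)]
            parity = bytearray(size)
            for share in shares:
                for i, b in enumerate(share):
                    parity[i] ^= b
            return shares + [parity]

        def rebuild(shares, lost_index):
            """Recover one lost share by XOR-ing all surviving shares."""
            size = len(next(s for i, s in enumerate(shares) if i != lost_index))
            rebuilt = bytearray(size)
            for i, share in enumerate(shares):
                if i == lost_index:
                    continue
                for j, b in enumerate(share):
                    rebuilt[j] ^= b
            return rebuilt

        clouds = split_with_parity(b"hello multiple-cloud storage")
        print(rebuild(clouds, 1) == clouds[1])               # True: lost share 1 recovered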
  • 51
    Publikationsdatum: 2013-12-07
    Beschreibung: For multiple heterogeneous multicore server processors across clouds and data centers, the aggregated performance of the cloud of clouds can be optimized by load distribution and balancing. Energy efficiency is one of the most important issues for large-scale server systems in current and future data centers. The multicore processor technology provides new levels of performance and energy efficiency. The present paper aims to develop power and performance constrained load distribution methods for cloud computing in current and future large-scale data centers. In particular, we address the problem of optimal power allocation and load distribution for multiple heterogeneous multicore server processors across clouds and data centers. Our strategy is to formulate optimal power allocation and load distribution for multiple servers in a cloud of clouds as optimization problems, i.e., power constrained performance optimization and performance constrained power optimization. Our research problems in large-scale data centers are well-defined multivariable optimization problems, which explore the power-performance tradeoff by fixing one factor and minimizing the other, from the perspective of optimal load distribution. It is clear that such power and performance optimization is important for a cloud computing provider to efficiently utilize all the available resources. We model a multicore server processor as a queuing system with multiple servers. Our optimization problems are solved for two different models of core speed, where one model assumes that a core runs at zero speed when it is idle, and the other model assumes that a core runs at a constant speed. Our results in this paper provide new theoretical insights into power management and performance optimization in data centers.
    Print ISSN: 0018-9340
    Digitale ISSN: 1557-9956
    Thema: Informatik
    Standort Signatur Erwartet Verfügbarkeit
    BibTip Andere fanden auch interessant ...
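    The entry above models each multicore server as a queuing system and trades power against performance. The sketch below evaluates a toy version of that tradeoff with M/M/1 queues and a cubic power-speed law; the speed values, the power model, and the speed-proportional load split are assumptions for illustration only, not the paper's formulation.

        def mm1_response_time(arrival_rate, speed, service_demand=1.0):
            """Mean response time of an M/M/1 queue with service rate speed/service_demand."""
            service_rate = speed / service_demand
            if arrival_rate >= service_rate:
                return float("inf")                       # unstable: load exceeds capacity
            return 1.0 / (service_rate - arrival_rate)

        def evaluate(speeds, arrival_rate, alpha=3.0):
            """Split load proportionally to speed; report mean response time and total power."""
            total_speed = sum(speeds)
            times = [mm1_response_time(arrival_rate * s / total_speed, s) for s in speeds]
            power = sum(s ** alpha for s in speeds)       # dynamic power grows roughly as speed^alpha
            return sum(times) / len(times), power

        for speeds in ([1.0, 1.0, 1.0, 1.0], [2.0, 2.0]): # same aggregate speed, different servers
            print(speeds, evaluate(speeds, arrival_rate=3.0))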
  • 52
    facet.materialart.
    Unbekannt
    Institute of Electrical and Electronics Engineers (IEEE)
    Publikationsdatum: 2013-12-07
    Beschreibung: Efficiently analyzing big data is a major issue in our current era. Examples of analysis tasks include the identification or detection of global weather patterns, economic changes, social phenomena, or epidemics. The cloud computing paradigm, along with software tools such as implementations of the popular MapReduce framework, offers a response to the problem by distributing computations among large sets of nodes. In many scenarios, however, input data are geographically distributed (geo-distributed) across data centers, and straightforwardly moving all data to a single data center before processing it can be prohibitively expensive. The above-mentioned tools are designed to work within a single cluster or data center and perform poorly or not at all when deployed across data centers. This paper deals with executing sequences of MapReduce jobs on geo-distributed data sets. We analyze possible ways of executing such jobs, and propose data transformation graphs that can be used to determine schedules for job sequences which are optimized either with respect to execution time or monetary cost. We introduce G-MR, a system for executing such job sequences, which implements our optimization framework. We present empirical evidence in Amazon EC2 and VICCI of the benefits of G-MR over common, naïve deployments for processing geo-distributed data sets. Our evaluations show that using G-MR significantly improves processing time and cost for geo-distributed data sets.
    Print ISSN: 0018-9340
    Digitale ISSN: 1557-9956
    Thema: Informatik
    Standort Signatur Erwartet Verfügbarkeit
    BibTip Andere fanden auch interessant ...
  • 53
    facet.materialart.
    Unbekannt
    Institute of Electrical and Electronics Engineers (IEEE)
    Publikationsdatum: 2013-12-07
    Print ISSN: 0018-9340
    Digitale ISSN: 1557-9956
    Thema: Informatik
    Standort Signatur Erwartet Verfügbarkeit
    BibTip Andere fanden auch interessant ...
  • 54
    facet.materialart.
    Unbekannt
    Institute of Electrical and Electronics Engineers (IEEE)
    Publikationsdatum: 2013-12-07
    Beschreibung: Iran is a country in a dry part of the world and suffers extensively from drought. Drought is a natural, temporary, and iterative phenomenon that is caused by a shortage of rainfall, which adversely affects people's health and well-being as well as the society's economy and politics, with far-reaching consequences. Information on the intensity, duration, and spatial coverage of drought can help decision makers reduce the vulnerability of drought-affected areas and, therefore, lessen the risks associated with drought episodes. One of the major challenges of modeling drought (and short-term forecasting) in Iran is the unavailability of long-term meteorological data for many parts of the country. Satellite-based remote sensing data, which are freely available, give information on vegetation conditions and land cover. In this paper, we constructed an artificial neural network to model (and forecast) drought conditions based on satellite imagery. To this end, the standardized precipitation index (SPI) was used as a measure of drought severity. A number of features including the normalized difference vegetation index (NDVI), vegetation condition index (VCI), and temperature condition index (TCI) were extracted from NOAA-AVHRR images. The model received these features as input and output the SPI value (or drought condition). Applying the model to the data of stations for which precipitation data were available, we showed that it could forecast the drought condition with an accuracy of up to 90 percent. Furthermore, TCI was found to be the best marker of drought conditions among the satellite-based features. We also found the multilayer perceptron to be better than radial basis function networks and support vector machines at forecasting drought conditions.
    Print ISSN: 0018-9340
    Digitale ISSN: 1557-9956
    Thema: Informatik
    Standort Signatur Erwartet Verfügbarkeit
    BibTip Andere fanden auch interessant ...
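    The entry above regresses a drought index (SPI) on satellite features (NDVI, VCI, TCI) with a neural network. The sketch below does the same on synthetic data with scikit-learn's MLPRegressor; the data, the network size, and the linear relation used to fabricate targets are assumptions for illustration, not the paper's data or model.

        import numpy as np
        from sklearn.neural_network import MLPRegressor
        from sklearn.model_selection import train_test_split

        rng = np.random.default_rng(0)
        # Synthetic stand-ins for NDVI, VCI, TCI features (one row per station/month).
        X = rng.uniform(0.0, 1.0, size=(500, 3))
        # Fabricated SPI-like target: severity loosely driven by the features plus noise.
        y = 2.0 * X[:, 2] + 1.0 * X[:, 1] + 0.5 * X[:, 0] - 1.8 + rng.normal(0, 0.1, 500)

        X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
        model = MLPRegressor(hidden_layer_sizes=(16,), max_iter=2000, random_state=0)
        model.fit(X_train, y_train)
        print("R^2 on held-out data:", round(model.score(X_test, y_test), 3))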
  • 55
    facet.materialart.
    Unbekannt
    Institute of Electrical and Electronics Engineers (IEEE)
    Publikationsdatum: 2013-12-07
    Beschreibung: We consider an integrated biomass inventory and energy production problem that arises from scenario analysis in national energy planning. It addresses together two decision stages that have previously been kept distinct: the decisions on the purchase of biomass and the decisions on the production of electricity and heat at each power plant. We model the problem as a stochastic mixed 0-1 integer linear programming problem. Since practical instances of this problem are very large, we experimentally assess a relaxed formulation to obtain near-optimal solutions. In addition, we study a Benders decomposition that exploits the problem structure of the relaxed formulation. In a distributed computing architecture, this decomposition allows us to increase the number of included scenarios and thus to better address the uncertainty of the data. Computational results indicate that, given the same information, the approximation provided by the relaxed model is good; moreover, because the relaxed model allows the amount of information treated to be increased, it can provide more accurate predictions. On a multicore computing architecture, a state-of-the-art MIP solver operating on the undecomposed model is sufficient to achieve performance similar to the Benders decomposition. However, the use of the solver in a distributed computing environment is not straightforward, and the Benders decomposition is a more easily implementable and scalable approach.
    Print ISSN: 0018-9340
    Digitale ISSN: 1557-9956
    Thema: Informatik
    Standort Signatur Erwartet Verfügbarkeit
    BibTip Andere fanden auch interessant ...
  • 56
    facet.materialart.
    Unbekannt
    Institute of Electrical and Electronics Engineers (IEEE)
    Publikationsdatum: 2013-12-07
    Beschreibung: Sensing and monitoring of our natural environment are important for sustainability. As sensor systems grow to a large scale, it will become infeasible to place all sensors under centralized control. We investigate community sensing, where sensors are controlled by self-interested agents that report their measurements to a center. The center can control the agents only through incentives that motivate them to provide the most accurate and useful reports. We consider different game-theoretic mechanisms that provide such incentives and analyze their properties. As an example, we consider an application of community sensing for monitoring air pollution.
    Print ISSN: 0018-9340
    Digitale ISSN: 1557-9956
    Thema: Informatik
    Standort Signatur Erwartet Verfügbarkeit
    BibTip Andere fanden auch interessant ...
  • 57
    facet.materialart.
    Unbekannt
    Institute of Electrical and Electronics Engineers (IEEE)
    Publikationsdatum: 2013-12-07
    Beschreibung: The drive toward sustainable wastewater management is challenging the conventional paradigm of linear end-of-pipe solutions. A shift toward more sustainable solutions requires that information about new ideas, systems, and technologies be more readily accessible for addressing wastewater problems. It is commonly argued that decision-making needs to involve engineers and other community representatives to define values and brainstorm solutions. This paper describes a decision support system (DSS) prototype that is designed to help community planners identify solutions which balance environmental, economic, and social goals. The system is designed to be scalable, adaptable, and flexible to allow fair assessment of new ideas and technologies. It supports the exploration of consequences of various alternatives and visualizes the tradeoffs between them. Our DSS takes in modular descriptions of components and a description of a community context, automates the design of alternative wastewater systems, and facilitates evaluating how well each design satisfies the given context. It provides an adaptable platform from which new solutions can be designed without having to predefine how a single component fits within a specific system. Our DSS facilitates the exploration of alternative solutions by visualizing the effect of various tradeoffs and their consequences in relation to the community's sustainability goals.
    Print ISSN: 0018-9340
    Digitale ISSN: 1557-9956
    Thema: Informatik
    Standort Signatur Erwartet Verfügbarkeit
    BibTip Andere fanden auch interessant ...
  • 58
    facet.materialart.
    Unbekannt
    Institute of Electrical and Electronics Engineers (IEEE)
    Publikationsdatum: 2013-12-07
    Beschreibung: Spatiotemporal planning involves making choices at multiple locations in space over some planning horizon to maximize utility and satisfy various constraints. In Forest Ecosystem Management, the problem is to choose actions for thousands of locations each year, including harvesting, treating trees for fire or pests, or doing nothing. The utility models could place value on the sale of lumber, ecosystem sustainability, or employment levels, and incorporate legal and logistical constraints on actions, such as avoiding large contiguous areas of clearcutting. Simulators developed by forestry researchers provide detailed dynamics but are generally inaccessible black boxes. We model spatiotemporal planning as a factored Markov decision process and present a policy gradient planning algorithm to optimize a stochastic spatial policy using simulated dynamics. It is common in environmental and resource planning for actions at different locations to be spatially interrelated; this makes representation and planning challenging. We define a global spatial policy in terms of interacting local policies defining distributions over actions at each location, conditioned on actions at nearby locations. Markov chain Monte Carlo simulation is used to sample landscape policies and estimate their gradients. Evaluation is carried out on a forestry planning problem with 1,880 locations using a variety of value models and constraints.
    Print ISSN: 0018-9340
    Digitale ISSN: 1557-9956
    Thema: Informatik
    Standort Signatur Erwartet Verfügbarkeit
    BibTip Andere fanden auch interessant ...
  • 59
    facet.materialart.
    Unbekannt
    Institute of Electrical and Electronics Engineers (IEEE)
    Publikationsdatum: 2013-12-07
    Beschreibung: We consider a dynamic Nash game among firms harvesting a renewable resource (e.g., in a fishery) and propose a differential variational inequality (DVI) framework for modeling and solving such a game. We suppose the firms compete over demand as well as over regulated harvest effort that we interpret as a sustainability constraint on the fleet's aggregate harvest effort. We suppose each firm is based in a home market that is not protected by trade barriers implying that each firm can sell its catch in any of the markets. Within this setting, we consider how harvest effort, catch, and sustainability of the resource are affected by the length of the planning horizon of the firms. We show results that contrast myopic planning versus long-term perspectives. To derive solutions for this game, we propose a DVI framework that is converted to a fixed-point problem. This allows us to employ a computationally efficient algorithm for the solution of the game.
    Print ISSN: 0018-9340
    Digitale ISSN: 1557-9956
    Thema: Informatik
    Standort Signatur Erwartet Verfügbarkeit
    BibTip Andere fanden auch interessant ...
  • 60
    facet.materialart.
    Unbekannt
    Institute of Electrical and Electronics Engineers (IEEE)
    Publikationsdatum: 2013-12-07
    Beschreibung: The focus of this work is the calibration of the land-use module of an integrated land-use and transportation model (ILUTM). The calibration task involves estimating key parameters that dictate the output of the land-use module. Hence, an algorithm based on maximum-likelihood estimation (MLE) is developed for calibration. Furthermore, the observed values of the outputs from the land-use module are assumed to admit a Gaussian error. The ILUTM methodology used here is TRANUS, which is used to model the city of Grenoble in France. The aforementioned algorithm is then applied to calibrate the land-use module of the Grenoble model. The covariance of the Gaussian error term is assumed to be unknown and is represented as a function of the land-use module inputs and "hyperparameters." The resulting MLE optimization problem has 111 parameters to be estimated, 90 of which are land-use parameters and 21 are hyperparameters of the Gaussian covariance kernel. The performance of the proposed calibration methodology is then compared to the traditional calibration techniques used for land-use and transportation models, when applied to the Grenoble land-use model. It is observed that the proposed method outperforms the traditional technique when compared on a given quantity of interest.
    Print ISSN: 0018-9340
    Digitale ISSN: 1557-9956
    Thema: Informatik
    Standort Signatur Erwartet Verfügbarkeit
    BibTip Andere fanden auch interessant ...
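    The entry above calibrates parameters by maximum-likelihood estimation under a Gaussian error model. The sketch below shows the generic pattern of minimizing a Gaussian negative log-likelihood with SciPy; the one-parameter toy model and the synthetic data are illustrative assumptions, not the TRANUS land-use module.

        import numpy as np
        from scipy.optimize import minimize

        rng = np.random.default_rng(0)
        x = np.linspace(0.0, 1.0, 50)
        y_obs = 3.0 * x + rng.normal(0.0, 0.2, size=x.size)   # observations with Gaussian error

        def neg_log_likelihood(params):
            slope, log_sigma = params
            sigma = np.exp(log_sigma)                          # keep the noise scale positive
            residuals = y_obs - slope * x                      # toy stand-in for the model output
            return 0.5 * np.sum(residuals**2 / sigma**2 + np.log(2.0 * np.pi * sigma**2))

        result = minimize(neg_log_likelihood, x0=[1.0, 0.0])   # estimate slope and noise level
        print(result.x)                                        # close to [3.0, log(0.2)]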
  • 61
    facet.materialart.
    Unbekannt
    Institute of Electrical and Electronics Engineers (IEEE)
    Publikationsdatum: 2013-12-07
    Beschreibung: Safety-critical distributed real-time applications operating with strict temporal constraints rely on deterministic networks with low latency and jitter. Traditional fieldbus systems deliver these guarantees, but they have limited compatibility with open infrastructures and limited support for high transmission rates. Ethernet technology has emerged as a low-cost, high-speed, and ubiquitous alternative to fieldbus systems; however, standard Ethernet requires special arbitration mechanisms to support real-time traffic because of its inherent nondeterministic behavior. This work explores the associated tradeoffs for three different solutions for real-time communication over switched Ethernet. The paper presents and discusses three architectures that modify different network components, enhancing them with additional customized modules to support time-triggered communication based on Network Code. Using the NetFPGA platform as the unified prototyping technology for all the components, we developed an open-source framework to characterize each solution using experimental data for latency, jitter, throughput, robustness, and cost in logical resources. The results provide insights to help future developers of real-time communication technology decide which components to modify according to the requirements of their applications.
    Print ISSN: 0018-9340
    Digitale ISSN: 1557-9956
    Thema: Informatik
    Standort Signatur Erwartet Verfügbarkeit
    BibTip Andere fanden auch interessant ...
  • 62
    facet.materialart.
    Unbekannt
    Institute of Electrical and Electronics Engineers (IEEE)
    Publikationsdatum: 2013-12-07
    Beschreibung: We present a cloud resource procurement approach that not only automates the selection of an appropriate cloud vendor but also implements dynamic pricing. Three possible mechanisms are suggested for cloud resource procurement: cloud-dominant strategy incentive compatible (C-DSIC), cloud-Bayesian incentive compatible (C-BIC), and cloud optimal (C-OPT). C-DSIC is dominant strategy incentive compatible, is based on the VCG mechanism, and is a low-bid Vickrey auction. C-BIC is Bayesian incentive compatible and achieves budget balance, but does not satisfy individual rationality. In C-DSIC and C-BIC, the cloud vendor that charges the lowest cost per unit QoS is declared the winner. In C-OPT, the cloud vendor with the least virtual cost is declared the winner. C-OPT overcomes the limitations of both C-DSIC and C-BIC: it is not only Bayesian incentive compatible, but also individually rational. Our experiments indicate that the resource procurement cost decreases as the number of cloud vendors increases, irrespective of the mechanism. We also propose a procurement module for a cloud broker that can implement C-DSIC, C-BIC, or C-OPT to perform resource procurement in a cloud computing context. A cloud broker with such a procurement module enables users to automate the choice of a cloud vendor among many with diverse offerings, and is also an essential first step toward implementing dynamic pricing in the cloud.
    Print ISSN: 0018-9340
    Digitale ISSN: 1557-9956
    Thema: Informatik
    Standort Signatur Erwartet Verfügbarkeit
    BibTip Andere fanden auch interessant ...
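    The entry above declares the vendor with the lowest cost per unit QoS the winner of a low-bid (reverse) Vickrey auction. The sketch below implements that generic selection rule with the textbook second-price payment; the bid values are made up, and the payment rule is the standard Vickrey rule rather than the paper's exact C-DSIC mechanism.

        def reverse_vickrey(bids):
            """bids: dict vendor -> (cost, qos). Winner has the lowest cost per unit QoS;
            payment is the runner-up's cost-per-QoS ratio scaled by the winner's QoS."""
            ratios = sorted(((cost / qos, vendor, qos) for vendor, (cost, qos) in bids.items()))
            (best_ratio, winner, winner_qos), (second_ratio, _, _) = ratios[0], ratios[1]
            payment = second_ratio * winner_qos        # winner is paid the runner-up's unit price
            return winner, payment

        bids = {"A": (10.0, 5.0), "B": (9.0, 3.0), "C": (20.0, 8.0)}
        print(reverse_vickrey(bids))                   # ('A', 12.5): A has the lowest cost per QoS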
  • 63
    facet.materialart.
    Unbekannt
    Institute of Electrical and Electronics Engineers (IEEE)
    Publikationsdatum: 2013-12-07
    Beschreibung: As an increasing number of infrastructure-as-a-service (IaaS) cloud providers start to provide cloud computing services, they form a competition market to compete for users of these services. Due to different resource capacities and service workloads, users may observe different finishing times for their cloud computing tasks and experience different levels of service quality as a result. To compete for cloud users, it is critically important for each cloud service provider to select an "optimal" price that best corresponds to its service quality, yet remains attractive to cloud users. To achieve this goal, the underlying rationale and characteristics of this competition market need to be better understood. In this paper, we present an in-depth game-theoretic study of such a competition market with multiple competing IaaS cloud providers. We characterize the nature of noncooperative competition in an IaaS cloud market, with the goal of capturing how each IaaS cloud provider will select its optimal prices to compete with the others. Our analyses lead to sufficient conditions for the existence of a Nash equilibrium, and we characterize the equilibrium analytically in special cases. Based on our analyses, we propose iterative algorithms for IaaS cloud providers to compute equilibrium prices, which converge quickly in our study.
    Print ISSN: 0018-9340
    Digitale ISSN: 1557-9956
    Thema: Informatik
    Standort Signatur Erwartet Verfügbarkeit
    BibTip Andere fanden auch interessant ...
  • 64
    facet.materialart.
    Unbekannt
    Institute of Electrical and Electronics Engineers (IEEE)
    Publikationsdatum: 2013-12-07
    Beschreibung: We consider Internet-based master-worker computations, where a master processor assigns, across the Internet, a computational task to a set of untrusted worker processors and collects their responses. Examples of such computations are the "@home" projects such as SETI. In this work, various worker behaviors are considered: altruistic workers always return the correct result of the task, malicious workers always return an incorrect result, and rational workers act based on their self-interest. In a massive computation platform such as the Internet, it is expected that all three types of workers coexist. Therefore, in this work, we study Internet-based master-worker computations in the presence of malicious, altruistic, and rational workers. A stochastic distribution of the workers over the three types is assumed. In addition, we consider the possibility that the communication between the master and the workers is not reliable, and that workers could be unavailable. Considering all three types of workers calls for a combination of game-theoretic and classical distributed computing approaches to the design of mechanisms for reliable Internet-based computing. Indeed, in this work, we design and analyze two algorithmic mechanisms that provide appropriate incentives for rational workers to act correctly, despite the malicious workers' actions and the unreliability of the communication. Only when necessary, the incentives are used to force the rational players to a certain equilibrium (which forces the workers to be truthful) that overcomes the attempt of the malicious workers to deceive the master. Finally, the mechanisms are analyzed in two realistic Internet-based master-worker settings: a SETI-like one and a contractor-based one, such as Amazon's Mechanical Turk. We also present plots that illustrate the tradeoffs between reliability and cost under different system parameters.
    Print ISSN: 0018-9340
    Digitale ISSN: 1557-9956
    Thema: Informatik
    Standort Signatur Erwartet Verfügbarkeit
    BibTip Andere fanden auch interessant ...
  • 65
    facet.materialart.
    Unbekannt
    Institute of Electrical and Electronics Engineers (IEEE)
    Publikationsdatum: 2013-12-07
    Beschreibung: We address several algorithms to perform a double-scalar multiplication on an elliptic curve. All the methods investigated are related to the double-base number system (DBNS) and extend previous work of Doche et al. We refine and rigorously prove the complexity analysis of the joint binary-ternary (JBT) algorithm. Experiments are in line with the theory and show that the JBT requires approximately 6 percent fewer field multiplications than the standard joint sparse form (JSF) method to compute [n]P + [m]Q. We also introduce a randomized version of the JBT, called JBT-Rand, that gives total control of the number of triplings in the expansion that is produced. So it becomes possible with the JBT-Rand to adapt and tune the number of triplings to the coordinate system and bit length that are used, to further decrease the cost of a double-scalar multiplication. Then, we focus on Koblitz curves. For extension degrees enjoying an optimal normal basis of type II, we discuss a Joint τ-DBNS approach that reduces the number of field multiplications by at least 35 percent over the traditional τ-JSF. For other extension degrees represented in polynomial basis, the Joint τ-DBNS is still relevant provided that appropriate basis conversion methods are used. In this situation, tests show that the speedup over the τ-JSF is then larger than 20 percent. Finally, when the use of the τ-DBNS becomes unrealistic, for instance because of the lack of an efficient normal basis or the lack of memory to allow an efficient conversion, we adapt the joint binary-ternary algorithm to Koblitz curves, giving rise to the Joint τ-τ̄ method, whose complexity is analyzed and proved. The Joint τ-τ̄ method induces a speedup of about 10 percent over the τ-JSF.
    Print ISSN: 0018-9340
    Digitale ISSN: 1557-9956
    Thema: Informatik
    Standort Signatur Erwartet Verfügbarkeit
    BibTip Andere fanden auch interessant ...
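    The entry above builds on double-base (2 and 3) representations of scalars. The sketch below shows the standard greedy double-base expansion of an integer as a sum of terms 2^a * 3^b, a common way to obtain such representations; it is a generic illustration, not the paper's joint binary-ternary algorithm.

        def greedy_double_base(n):
            """Greedily write n as a sum of terms 2^a * 3^b (largest such term first)."""
            terms = []
            while n > 0:
                best = 1
                power3 = 1
                while power3 <= n:                       # enumerate 3^b, then fill with factors of 2
                    term = power3
                    while term * 2 <= n:
                        term *= 2
                    best = max(best, term)
                    power3 *= 3
                terms.append(best)
                n -= best
            return terms

        n = 841232
        terms = greedy_double_base(n)
        print(terms, sum(terms) == n)                    # the terms really sum back to n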
  • 66
    facet.materialart.
    Unbekannt
    Institute of Electrical and Electronics Engineers (IEEE)
    Publikationsdatum: 2013-12-07
    Beschreibung: In cryptography, secure channels enable the confidential and authenticated exchange of messages between authorized users. A generic approach to constructing such channels is to combine an encryption primitive with an authentication primitive (MAC). In this work, we introduce the design of a new cryptographic primitive to be used in the construction of secure channels. Instead of using general-purpose MACs, we propose the deployment of special-purpose MACs, named E-MACs. The main motivation behind this work is the observation that, since the message must be both encrypted and authenticated, there might be some redundancy in the computations performed by the two primitives. Therefore, removing such redundancy can improve the efficiency of the overall composition. Moreover, computations performed by the encryption algorithm can be further utilized to improve the security of the authentication algorithm. In particular, we will show how E-MACs can be designed to reduce the amount of computation required by standard MACs based on universal hash functions, and how E-MACs can be secured against key-recovery attacks.
    Print ISSN: 0018-9340
    Digitale ISSN: 1557-9956
    Thema: Informatik
    Standort Signatur Erwartet Verfügbarkeit
    BibTip Andere fanden auch interessant ...
  • 67
    facet.materialart.
    Unbekannt
    Institute of Electrical and Electronics Engineers (IEEE)
    Publikationsdatum: 2013-12-07
    Beschreibung: Real-number block codes derived from the discrete Fourier transform (DFT) are corrected by coupling a heavily modified Berlekamp-Massey (BM) algorithm with a syndrome extension process. The modified BM algorithm recursively determines the locations of any large errors whose number is within the capability of the DFT code. It evolves a connection polynomial that is changed whenever a discrepancy exceeds a threshold, which is adjusted during each iteration. The large error locations are repositioned to exact location indices to combat low-level noise effects. The syndromes are extended using the refined connection polynomial taps. Alternatively, enhanced extension recursions based on Kalman syndrome extensions are developed and simulated.
    Print ISSN: 0018-9340
    Digitale ISSN: 1557-9956
    Thema: Informatik
    Standort Signatur Erwartet Verfügbarkeit
    BibTip Andere fanden auch interessant ...
  • 68
    facet.materialart.
    Unbekannt
    Institute of Electrical and Electronics Engineers (IEEE)
    Publikationsdatum: 2013-12-07
    Beschreibung: As throughput, scalability, and energy efficiency in networks-on-chip (NoCs) are becoming critical, there is a growing impetus to explore emerging technologies for implementing NoCs in future multicore and many-core architectures. Two disruptive technologies on the horizon are nanophotonic interconnects (NIs) and 3D stacking. NIs can deliver high on-chip bandwidth while delivering low energy per bit, thereby providing reasonable performance per watt in the future. Three-dimensional stacking can reduce the interconnect distance and increase the bandwidth density by incorporating multiple communication layers. In this paper, we propose an architecture that combines NIs and 3D stacking to design an energy-efficient and reconfigurable NoC. We quantitatively compare the hardware complexity of the proposed topology to other nanophotonic networks in terms of hop count, network diameter, radix, and photonic parameters. To maximize performance, we also propose an efficient reconfiguration algorithm that dynamically reallocates channel bandwidth by adapting to traffic fluctuations. For a 64-core reconfigured network, our simulation results indicate that the execution time can be reduced by up to 25 percent for Splash-2, PARSEC, and SPEC CPU2006 benchmarks. Moreover, for a 256-core version of the proposed architecture, our simulation results indicate a throughput improvement of more than 25 percent and energy savings of 23 percent on synthetic traffic when compared to competitive on-chip electrical and optical networks.
    Print ISSN: 0018-9340
    Digitale ISSN: 1557-9956
    Thema: Informatik
    Standort Signatur Erwartet Verfügbarkeit
    BibTip Andere fanden auch interessant ...
  • 69
    facet.materialart.
    Unbekannt
    Institute of Electrical and Electronics Engineers (IEEE)
    Publikationsdatum: 2013-12-07
    Beschreibung: Lists the reviewers who contributed to IEEE Transactions on Computers in 2013.
    Print ISSN: 0018-9340
    Digitale ISSN: 1557-9956
    Thema: Informatik
    Standort Signatur Erwartet Verfügbarkeit
    BibTip Andere fanden auch interessant ...
  • 70
    facet.materialart.
    Unbekannt
    Institute of Electrical and Electronics Engineers (IEEE)
    Publikationsdatum: 2013-12-14
    Print ISSN: 0018-9340
    Digitale ISSN: 1557-9956
    Thema: Informatik
    Standort Signatur Erwartet Verfügbarkeit
    BibTip Andere fanden auch interessant ...
  • 71
    facet.materialart.
    Unbekannt
    Institute of Electrical and Electronics Engineers (IEEE)
    Publikationsdatum: 2013-07-27
    Beschreibung: Scaling of CMOS technology into nanometric feature sizes has raised concerns for the reliable operation of logic circuits, such as in the presence of soft errors. This paper deals with the analysis of the operation of sequential circuits. As the feedback signals in a sequential circuit can be logically masked by specific combinations of primary inputs, the cumulative effects of soft errors can be eliminated. This phenomenon, referred to as error masking, is related to the presence of so-called restoring inputs and/or the consecutive presence of specific inputs in multiple clock cycles (equivalent to a synchronizing sequence in switching theory). In this paper, error masking is extensively analyzed using the operations of state transition matrices (STMs) and binary decision diagrams (BDDs) of a finite state machine (FSM) model. The characteristics of state transitions with respect to correlations between the restoring inputs and time sequence are mathematically established using STMs; although the applicability of the STM analysis is restricted due to its complexity, the BDD approach is more efficient and scalable for use in the analysis of large circuits. These results are supported by simulations of benchmark circuits and may provide a basis for further devising efficient and robust implementations when designing FSMs.
    Print ISSN: 0018-9340
    Digitale ISSN: 1557-9956
    Thema: Informatik
    Standort Signatur Erwartet Verfügbarkeit
    BibTip Andere fanden auch interessant ...
  • 72
    facet.materialart.
    Unbekannt
    Institute of Electrical and Electronics Engineers (IEEE)
    Publikationsdatum: 2013-07-27
    Beschreibung: This paper introduces the concept of monitoring-as-a-service (MaaS), its main components, and a suite of key functional requirements of MaaS in the cloud. We argue that MaaS should support not only conventional state monitoring capabilities, such as instantaneous violation detection, periodic state monitoring, and single-tenant monitoring, but also performance-enhanced functionalities that can optimize monitoring cost, scalability, and the effectiveness of monitoring service consolidation and isolation. In this paper, we present three enhanced MaaS capabilities and show that window-based state monitoring is not only more resilient to noise and outliers, but also saves considerable communication cost. Similarly, violation-likelihood-based state monitoring can dynamically adjust monitoring intensity based on the likelihood of detecting important events, leading to significant gains in monitoring service consolidation. Finally, multitenancy support in state monitoring allows multiple cloud users to enjoy MaaS with improved performance and efficiency at more affordable cost. We perform extensive experiments in an emulated cloud environment with real-world system and network traces. The experimental results suggest that our MaaS framework achieves significantly lower monitoring cost, higher scalability, and better multitenancy performance.
    Print ISSN: 0018-9340
    Digitale ISSN: 1557-9956
    Thema: Informatik
    Standort Signatur Erwartet Verfügbarkeit
    BibTip Andere fanden auch interessant ...
  • 73
    facet.materialart.
    Unbekannt
    Institute of Electrical and Electronics Engineers (IEEE)
    Publikationsdatum: 2013-07-27
    Beschreibung: An algorithm and architecture for powering computation and root extraction, with fixed-point and floating-point exponents, is presented in this paper. The algorithm is based on an optimized iterative sequence of parallel and/or overlapped operations: 1) reciprocal, 2) high-radix digit-recurrence logarithm, 3) left-to-right carry-free multiplication, and 4) high-radix online exponential. A redundant number system is used to allow for the overlapping of the different operations of the algorithm. As the logarithm and exponential are part of the sequence of operations, some minor changes are made to allow for the independent computation of the logarithm and exponential functions. A sequential implementation of the algorithm is proposed and the execution times and hardware requirements are estimated for single and double-precision floating-point computations. These estimates are obtained for several radices, according to an approximate model for the delay and area of the main logic blocks, and help to determine the radix values, which lead to the most efficient implementations.
    Print ISSN: 0018-9340
    Digitale ISSN: 1557-9956
    Thema: Informatik
    Standort Signatur Erwartet Verfügbarkeit
    BibTip Andere fanden auch interessant ...
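    The entry above computes powers and roots through a sequence built around logarithm and exponential evaluations. The sketch below illustrates the underlying identity x^y = exp(y * ln x), with root extraction as the special case y = 1/q; it is a plain software illustration of the identity, not the paper's high-radix digit-recurrence hardware algorithm.

        import math

        def power_via_log_exp(x, y):
            """Compute x**y for x > 0 using the identity x**y = exp(y * ln x)."""
            return math.exp(y * math.log(x))

        def root_via_log_exp(x, q):
            """q-th root as the special case y = 1/q (a reciprocal feeds the exponent)."""
            return power_via_log_exp(x, 1.0 / q)

        print(power_via_log_exp(2.0, 10))        # ~1024.0
        print(root_via_log_exp(27.0, 3))         # ~3.0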
  • 74
    facet.materialart.
    Unbekannt
    Institute of Electrical and Electronics Engineers (IEEE)
    Publikationsdatum: 2013-07-27
    Beschreibung: For a given undirected (edge) weighted graph G = (V, E), a terminal set S ⊆ V, and a root r ∈ S, the rooted k-vertex connected minimum Steiner network (kVSMN_r) problem requires constructing a minimum-cost subgraph of G such that each terminal in S \ {r} is k-vertex connected to r. As an important problem in survivable network design, the kVSMN_r problem is known to be NP-hard even when k = 1. For k = 3, this paper presents a simple combinatorial eight-approximation algorithm, improving the best known ratio of 14 due to Nutov. Our algorithm constructs an approximate 3VSMN_r by augmenting a two-vertex connected counterpart with additional edges whose cost is bounded with respect to the optimum. We prove that the total cost of the added edges is at most six times the optimum by showing that the edges of a 3VSMN_r compose a subgraph containing our solution in such a way that each edge appears in the subgraph at most six times.
    Print ISSN: 0018-9340
    Digitale ISSN: 1557-9956
    Thema: Informatik
    Standort Signatur Erwartet Verfügbarkeit
    BibTip Andere fanden auch interessant ...
  • 75
    facet.materialart.
    Unbekannt
    Institute of Electrical and Electronics Engineers (IEEE)
    Publikationsdatum: 2013-07-27
    Beschreibung: Addition is a fundamental arithmetic function; several adder designs have been proposed for implementations in inexact computing. These adders show different operational profiles; some of them are approximate in nature, while others rely on probabilistic features of nanoscale circuits. However, there has been a lack of appropriate metrics to evaluate the efficacy of various inexact designs. In this paper, new metrics are proposed for evaluating the reliability as well as the power efficiency of approximate and probabilistic adders. Reliability is analyzed using so-called sequential probability transition matrices (SPTMs). Error distance (ED) is initially defined as the arithmetic distance between an erroneous output and the correct output for a given input. The mean error distance (MED) and normalized error distance (NED) are then proposed as unified figures that consider the averaging effect of multiple inputs and the normalization of multiple-bit adders. It is shown that the MED is an effective metric for measuring the implementation accuracy of a multiple-bit adder and that the NED is a nearly invariant metric independent of the size of an adder. The MED is, therefore, useful in assessing the effectiveness of an approximate or probabilistic adder implementation, while the NED is useful in characterizing the reliability of a specific design. Since inexact adders are often used for saving power, the product of power and NED is further utilized for evaluating the tradeoffs between power consumption and precision. Although illustrated using adders, the proposed metrics are potentially useful in assessing other arithmetic circuit designs for applications of inexact computing.
    Print ISSN: 0018-9340
    Digitale ISSN: 1557-9956
    Thema: Informatik
    Standort Signatur Erwartet Verfügbarkeit
    BibTip Andere fanden auch interessant ...
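    The entry above defines the error distance (ED), mean error distance (MED), and normalized error distance (NED) for inexact adders. The sketch below computes these quantities for a toy approximate adder that ORs the lowest bits instead of adding them; the example adder and the choice of normalizing by the largest exact sum are illustrative assumptions, not the paper's exact definitions.

        from itertools import product

        def approx_add(a, b, width=8, broken_low_bits=2):
            """Toy approximate adder: the lower bits are OR-ed instead of added (no carries)."""
            mask = (1 << broken_low_bits) - 1
            low = (a | b) & mask
            high = ((a & ~mask) + (b & ~mask)) & ((1 << (width + 1)) - 1)
            return high | low

        def med_ned(width=8, broken_low_bits=2):
            """Mean error distance over all input pairs, and NED normalized by the largest exact sum."""
            errors = []
            for a, b in product(range(1 << width), repeat=2):
                exact = a + b
                errors.append(abs(approx_add(a, b, width, broken_low_bits) - exact))  # ED per input
            med = sum(errors) / len(errors)
            ned = med / ((1 << (width + 1)) - 2)         # largest exact sum of two width-bit inputs
            return med, ned

        print(med_ned())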
  • 76
    facet.materialart.
    Unbekannt
    Institute of Electrical and Electronics Engineers (IEEE)
    Publikationsdatum: 2013-07-03
    Beschreibung: The trend toward integrated many-core architectures makes the network-on-chip (NoC) technology, the on-chip communication infrastructure of choice. However, and as opposed to a simple bus, due to its distributed and complex nature in terms of topology, wire size, routing algorithm, and so on, the timing behavior and thus performance of the infrastructure is difficult to predict. Therefore, one of the important phases in the NoC design flow is performance evaluation, which is to extract performance metrics to verify whether a specific instance from the NoC design space satisfies the requirements of the entire system. In this sense, reducing the time to obtain the NoC performance and consequently speeding-up the design space exploration is one of the keys that can considerably reduce the design-flow time and cost. In an effort toward this direction, we propose in this paper a novel analytical performance evaluation method that can be used in the earliest stages of the design flow, before using time-consuming simulations. The analytical method is used to evaluate the performance of a general purpose NoC and we show that it can predict the router latency, end-to-end per-flow latency, and network saturation point with an accuracy comparable to a cycle-accurate simulation. To systematically analyze the accuracy of our method compared to the corresponding simulation model, we present also an innovative accuracy analysis method.
    Print ISSN: 0018-9340
    Digitale ISSN: 1557-9956
    Thema: Informatik
    Standort Signatur Erwartet Verfügbarkeit
    BibTip Andere fanden auch interessant ...
  • 77
    facet.materialart.
    Unbekannt
    Institute of Electrical and Electronics Engineers (IEEE)
    Publikationsdatum: 2013-03-19
Beschreibung: In the design of time-critical applications, schedulability analysis is used to define the feasibility region of tasks with deadlines, so that optimization techniques can find the best design solution within the timing constraints. Formulating the feasibility region directly from the response-time calculation requires many integer variables and is too complex for solvers. Approximation techniques have been used to define a convex subset of the feasibility region, used in conjunction with a branch-and-bound approach to compute suboptimal solutions for optimal task period selection, priority assignment, or placement of tasks onto CPUs. In this paper, we provide an improved and simpler real-time schedulability test that allows an exact and efficient definition of the feasibility region in Mixed Integer Linear Programming (MILP) optimization. Our method requires a significantly smaller number of binary variables and is viable for industrial-size problems, as shown by the experiments. (A sketch of the underlying response-time test follows this entry.)
    Print ISSN: 0018-9340
    Digitale ISSN: 1557-9956
    Thema: Informatik
    Standort Signatur Erwartet Verfügbarkeit
    BibTip Andere fanden auch interessant ...
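For readers unfamiliar with the underlying test, the sketch below shows the classical fixed-point response-time analysis for fixed-priority preemptive tasks, which is the schedulability condition that such MILP feasibility-region formulations encode; the task set is a hypothetical example, and the MILP encoding itself is not reproduced here.

```python
import math

# Tasks are (C, T, D) = (worst-case execution time, period, deadline),
# listed in priority order (index 0 = highest priority).
def response_time(tasks, i):
    C, T, D = tasks[i]
    r = C
    while True:
        # interference from all higher-priority tasks released within r
        interference = sum(math.ceil(r / Tj) * Cj for Cj, Tj, _ in tasks[:i])
        r_next = C + interference
        if r_next == r:
            return r                     # fixed point reached
        if r_next > D:
            return None                  # deadline exceeded -> not schedulable
        r = r_next

def schedulable(tasks):
    return all(response_time(tasks, i) is not None for i in range(len(tasks)))

if __name__ == "__main__":
    tasks = [(1, 4, 4), (2, 6, 6), (3, 12, 12)]   # toy (C, T, D) values
    print(schedulable(tasks))                     # True for this example set
```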
  • 78
    facet.materialart.
    Unbekannt
    Institute of Electrical and Electronics Engineers (IEEE)
    Publikationsdatum: 2013-03-19
Beschreibung: Peers participating in a distributed hash table (DHT) may host different numbers of virtual servers and can balance their loads by reallocating those virtual servers. Most decentralized load-balancing algorithms designed for DHTs based on virtual servers require the participating peers to be asymmetric, with some serving as rendezvous nodes that pair virtual servers with participating peers, thereby introducing another load imbalance problem. While state-of-the-art studies aim to present symmetric load-balancing algorithms, they introduce significant algorithmic overhead and offer no rigorous performance guarantees. In this paper, a novel symmetric load-balancing algorithm for DHTs is presented in which the participating peers approximate the system state with histograms and cooperatively implement a global index. In our proposal, each peer independently reallocates its locally hosted virtual servers by publishing to and querying the global index based on these histograms. Unlike competing algorithms, our proposal offers analytical performance guarantees in terms of the load balance factor and the algorithmic convergence rate, and introduces no load imbalance due to the algorithmic workload itself. Through computer simulations, we show that our proposal clearly outperforms existing distributed algorithms in terms of load balance factor with a comparable movement cost. (A small histogram-guided rebalancing sketch follows this entry.)
    Print ISSN: 0018-9340
    Digitale ISSN: 1557-9956
    Thema: Informatik
    Standort Signatur Erwartet Verfügbarkeit
    BibTip Andere fanden auch interessant ...
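As a rough illustration of histogram-guided reallocation of virtual servers, the sketch below builds a load histogram and greedily migrates virtual servers from overloaded peers to a peer in the lightest bucket; it is a generic, centralized toy, not the paper's decentralized publish/inquire protocol, and the bucket count, balance target, and example loads are invented.

```python
import bisect

def histogram(loads, bins=4):
    """Group peers into load buckets (an approximation of the system state)."""
    lo, hi = min(loads.values()), max(loads.values()) + 1e-9
    edges = [lo + (hi - lo) * i / bins for i in range(bins + 1)]
    buckets = [[] for _ in range(bins)]
    for peer, load in loads.items():
        buckets[min(bisect.bisect_right(edges, load) - 1, bins - 1)].append(peer)
    return buckets

def rebalance(servers, target_factor=1.2):
    """servers: peer -> list of virtual-server loads. Returns the migrations."""
    loads = {p: sum(vs) for p, vs in servers.items()}
    mean = sum(loads.values()) / len(loads)
    moves = []
    for peer in sorted(loads, key=loads.get, reverse=True):
        while loads[peer] > target_factor * mean and len(servers[peer]) > 1:
            buckets = histogram(loads)
            light = min(buckets[0], key=loads.get)   # peer from the lightest bucket
            vs = min(servers[peer])                  # hand off the smallest server
            servers[peer].remove(vs)
            servers[light].append(vs)
            loads[peer] -= vs
            loads[light] += vs
            moves.append((vs, peer, light))
    return moves

if __name__ == "__main__":
    servers = {"p1": [5, 4, 3], "p2": [1], "p3": [2, 1], "p4": [1]}
    print(rebalance(servers))
```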
  • 79
    facet.materialart.
    Unbekannt
    Institute of Electrical and Electronics Engineers (IEEE)
    Publikationsdatum: 2013-03-19
Beschreibung: Stereo vision, a technique for inferring depth information from stereo images, has been used in a wide range of computer vision applications, with real-time requirements in emerging embedded vision systems. Computing the disparity map, a vital step in extracting depth information from stereo images, requires a significant amount of computational resources. As such, existing software implementations require high-end hardware platforms to achieve real-time frame rates, suggesting that dedicated hardware mechanisms are more suitable for embedded applications. In this paper, we present a disparity map computation architecture targeting embedded stereo vision applications with hard real-time requirements. The architecture integrates a hardware edge-detection mechanism that reduces the search space, improving overall performance, and is configurable in terms of various application parameters, making it suitable for a number of application environments. The paper also presents a study of the impact of these parameters on performance and hardware/power overheads. An experimental prototype of the architecture was implemented on the Xilinx ML505 FPGA evaluation platform, achieving 50 frames per second (fps) for 1,280 × 1,024 images. Moreover, the quality of the disparity maps generated by the proposed system is comparable to that of other hardware implementations featuring local stereo correspondence methods. (A small block-matching sketch follows this entry.)
    Print ISSN: 0018-9340
    Digitale ISSN: 1557-9956
    Thema: Informatik
    Standort Signatur Erwartet Verfügbarkeit
    BibTip Andere fanden auch interessant ...
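The sketch below illustrates the kind of computation such an architecture accelerates: winner-take-all SAD block matching along one scan line, with a crude horizontal-gradient test standing in for the edge-based search-space reduction; the window size, gradient threshold, and maximum disparity are arbitrary assumptions, and the row index y is assumed to lie away from the image border.

```python
# Local stereo correspondence sketch: sum-of-absolute-differences (SAD)
# block matching with a simple gradient gate that skips low-texture pixels.
def sad(left, right, x, xr, y, w):
    return sum(abs(left[y + dy][x + dx] - right[y + dy][xr + dx])
               for dy in range(-w, w + 1) for dx in range(-w, w + 1))

def disparity_row(left, right, y, max_disp=16, w=1, edge_thr=10):
    width = len(left[0])
    out = [0] * width
    for x in range(w + max_disp, width - w):
        # crude edge test: horizontal gradient must exceed a threshold
        if abs(left[y][x + 1] - left[y][x - 1]) < edge_thr:
            continue                          # skip textureless pixel
        best, best_d = float("inf"), 0
        for d in range(max_disp + 1):         # candidate disparities
            cost = sad(left, right, x, x - d, y, w)
            if cost < best:
                best, best_d = cost, d
        out[x] = best_d
    return out
```

Given two rectified grayscale images as lists of integer rows, disparity_row(left, right, y) returns per-pixel disparities for row y, with zeros at skipped (textureless or border) pixels.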
  • 80
    facet.materialart.
    Unbekannt
    Institute of Electrical and Electronics Engineers (IEEE)
    Publikationsdatum: 2013-03-19
Beschreibung: RC4 is the most popular stream cipher in the domain of cryptology. In this paper, we present a systematic study of the hardware implementation of RC4 and propose the fastest known architecture for the cipher. We combine the ideas of hardware pipelining and loop unrolling to design an architecture that produces two RC4 keystream bytes per clock cycle. We have optimized and implemented our proposed design in VHDL and synthesized it with 130, 90, and 65 nm fabrication technologies at clock frequencies of 625 MHz, 1.37 GHz, and 1.92 GHz, respectively, obtaining a final RC4 keystream throughput of 10, 21.92, and 30.72 Gbps in the respective technologies. (A software sketch of the two-bytes-per-iteration idea follows this entry.)
    Print ISSN: 0018-9340
    Digitale ISSN: 1557-9956
    Thema: Informatik
    Standort Signatur Erwartet Verfügbarkeit
    BibTip Andere fanden auch interessant ...
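For reference, the sketch below is a plain software model of RC4 with the keystream generation loop unrolled by two, so that each iteration yields a pair of bytes; it illustrates the algorithmic idea only and says nothing about the paper's pipelined hardware datapath.

```python
# Standard RC4 key-scheduling algorithm (KSA).
def rc4_ksa(key):
    S = list(range(256))
    j = 0
    for i in range(256):
        j = (j + S[i] + key[i % len(key)]) & 0xFF
        S[i], S[j] = S[j], S[i]
    return S

def rc4_keystream_pairs(key, n_pairs):
    """Yield two keystream bytes per loop iteration (PRGA unrolled by two)."""
    S = rc4_ksa(key)
    i = j = 0
    for _ in range(n_pairs):
        out = []
        for _ in range(2):                     # two serial PRGA steps per iteration
            i = (i + 1) & 0xFF
            j = (j + S[i]) & 0xFF
            S[i], S[j] = S[j], S[i]
            out.append(S[(S[i] + S[j]) & 0xFF])
        yield out[0], out[1]

if __name__ == "__main__":
    ks = rc4_keystream_pairs(b"Key", 4)
    print([f"{a:02x}{b:02x}" for a, b in ks])
```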
  • 81
    facet.materialart.
    Unbekannt
    Institute of Electrical and Electronics Engineers (IEEE)
    Publikationsdatum: 2013-03-19
Beschreibung: The rapid rise in the number of resource-constrained wireless devices and the need for secure communication with servers call for fast and efficient cryptographic computation on both sides. Efficient hardware implementation of arithmetic over finite fields using a Gaussian normal basis is attractive for public-key cryptography because squaring comes essentially for free. In this paper, we first present two low-complexity digit-level multiplier architectures. It is shown that the proposed multipliers outperform the existing Gaussian normal basis (GNB) multiplier structures available in the literature. Then, for the first time, using these two architectures, we propose a new digit-level hybrid multiplier that performs two successive multiplications with the same latency as a single multiplication. We study the efficiency of the proposed hybrid architecture in terms of area and time delay for different digit sizes. The main advantage of this new hybrid architecture is to speed up exponentiation and point multiplication whenever a double multiplication is required and traditional schemes fail due to data dependencies. We investigate the applicability of the proposed hybrid structure to reducing the latency of exponentiation-based cryptosystems. Our analysis and timing results show that the expected acceleration in double exponentiation is considerable. Prototypes of the presented low-complexity multiplier architectures and the proposed hybrid architecture are implemented, and experimental results are presented. (A sketch of why normal-basis squaring is free follows this entry.)
    Print ISSN: 0018-9340
    Digitale ISSN: 1557-9956
    Thema: Informatik
    Standort Signatur Erwartet Verfügbarkeit
    BibTip Andere fanden auch interessant ...
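The "free squaring" property that motivates normal-basis arithmetic can be seen in a few lines: in a normal basis of GF(2^m), squaring permutes the basis elements cyclically, so it reduces to a rotation of the coordinate vector. The sketch below shows only this property; the digit-level and hybrid multipliers of the paper are not modelled, and the field size and coordinates are arbitrary examples.

```python
# With basis {b, b^2, b^4, ..., b^(2^(m-1))} of GF(2^m), squaring an element
# is just a cyclic shift of its coordinate vector (no logic gates needed).
def nb_square(coords):
    """Square an element given by its normal-basis coordinate vector."""
    return coords[-1:] + coords[:-1]       # rotate right by one position

def nb_power_2k(coords, k):
    """Raise to the power 2^k: k squarings = rotation by k positions."""
    m = len(coords)
    k %= m
    return coords[-k:] + coords[:-k] if k else list(coords)

if __name__ == "__main__":
    a = [1, 0, 1, 1, 0]                    # element of GF(2^5) in a normal basis
    print(nb_square(a))                    # [0, 1, 0, 1, 1]
    print(nb_power_2k(a, 3))               # three free squarings
```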
  • 82
    facet.materialart.
    Unbekannt
    Institute of Electrical and Electronics Engineers (IEEE)
    Publikationsdatum: 2013-03-19
Beschreibung: Images are often corrupted by impulse noise during image acquisition and transmission. In this paper, we propose an efficient denoising scheme and its VLSI architecture for the removal of random-valued impulse noise. To achieve the goal of low cost, a low-complexity VLSI architecture is proposed. We employ a decision-tree-based impulse noise detector to identify noisy pixels and an edge-preserving filter to reconstruct their intensity values. Furthermore, an adaptive technique is used to enhance the removal of impulse noise. Our extensive experimental results demonstrate that the proposed technique achieves better performance, in terms of both quantitative evaluation and visual quality, than previous lower-complexity methods, and its performance is comparable to that of higher-complexity methods. The VLSI implementation of our design achieves a clock rate of about 200 MHz using TSMC 0.18 μm technology. Compared with state-of-the-art techniques, this work reduces memory storage by more than 99 percent. The design requires only low computational complexity and two line memory buffers; its hardware cost is low, making it suitable for many real-time applications. (A small detector/filter sketch follows this entry.)
    Print ISSN: 0018-9340
    Digitale ISSN: 1557-9956
    Thema: Informatik
    Standort Signatur Erwartet Verfügbarkeit
    BibTip Andere fanden auch interessant ...
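The sketch below mimics the two stages described above with a simple rule-based detector and a directional, edge-preserving reconstruction; the thresholds and rules are invented for illustration and are not the paper's decision tree or filter.

```python
# Flag a pixel as impulse noise when it deviates strongly from every 3x3
# neighbour, then reconstruct it along the most homogeneous direction.
def denoise(img, thr=40):
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            c = img[y][x]
            nbrs = [img[y + dy][x + dx]
                    for dy in (-1, 0, 1) for dx in (-1, 0, 1) if (dy, dx) != (0, 0)]
            if min(abs(c - n) for n in nbrs) <= thr:
                continue                    # detector: looks like a clean pixel
            # reconstruction: pick the direction (horizontal, vertical, two
            # diagonals) whose two neighbours are most alike, and use their mean
            dirs = [((0, -1), (0, 1)), ((-1, 0), (1, 0)),
                    ((-1, -1), (1, 1)), ((-1, 1), (1, -1))]
            best = min(dirs, key=lambda d: abs(img[y + d[0][0]][x + d[0][1]] -
                                               img[y + d[1][0]][x + d[1][1]]))
            p1 = img[y + best[0][0]][x + best[0][1]]
            p2 = img[y + best[1][0]][x + best[1][1]]
            out[y][x] = (p1 + p2) // 2
    return out
```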
  • 83
    facet.materialart.
    Unbekannt
    Institute of Electrical and Electronics Engineers (IEEE)
    Publikationsdatum: 2013-03-19
Beschreibung: With the popularity of wireless devices and the increase in computing and storage resources, there is growing interest in supporting mobile computing techniques. In particular, ad hoc networks can connect different wireless devices to enable more powerful wireless applications and mobile computing capabilities. To meet the ever-increasing communication demand, it is important to improve network throughput while guaranteeing transmission reliability. Multiple-input-multiple-output (MIMO) technology can provide significantly higher data rates in ad hoc networks whose nodes are equipped with multi-antenna arrays. Although the MIMO technique itself can support diversity transmission when the channel condition degrades, diversity transmission often compromises the multiplexing gain and is still not enough to deal with extremely weak channels. Instead, in this work, we exploit cooperative relay transmission (often used in single-antenna environments to improve reliability) in a MIMO-based ad hoc network to cope with harsh channel conditions. We design both centralized and distributed scheduling algorithms to support the adaptive use of cooperative relay transmission when direct transmission cannot be performed successfully. Our algorithm effectively exploits the cooperative multiplexing gain and cooperative diversity gain to achieve higher data rates and higher reliability under various channel conditions. Our scheduling scheme can invoke relay transmission efficiently without introducing the significant signaling overhead of conventional relay schemes, and it seamlessly integrates relay transmission with multiplexed MIMO transmission. We also design a MAC protocol to implement the distributed algorithm. Our performance results demonstrate that the use of cooperative relay in a MIMO framework brings a significant throughput improvement in all the scenarios studied, across variations in node density, link failure ratio, packet arrival rate, and retransmission threshold.
    Print ISSN: 0018-9340
    Digitale ISSN: 1557-9956
    Thema: Informatik
    Standort Signatur Erwartet Verfügbarkeit
    BibTip Andere fanden auch interessant ...
  • 84
    facet.materialart.
    Unbekannt
    Institute of Electrical and Electronics Engineers (IEEE)
    Publikationsdatum: 2013-03-19
Beschreibung: As NAND flash memory gains popularity as a storage medium for mobile embedded devices, many flash-aware file systems, flash-aware DBMSes, and flash translation layers (FTLs) require a flash-efficient index structure. This paper proposes a novel index structure called $\mu^{*}$-Tree, which works natively on NAND flash memory and aims to improve performance over the $\mathrm{B}^{+}$-Tree. The $\mu^{*}$-Tree stores all the nodes along the path from the root to a leaf in a single flash memory page in order to minimize the number of flash write operations when a node is updated. Furthermore, the $\mu^{*}$-Tree has an adaptive page layout scheme that dynamically adjusts the page layout according to the workload characteristics on the fly. The $\mu^{*}$-Tree also allows flash pages with different page layouts to coexist in the same tree. Our evaluation results with real workload traces show that the $\mu^{*}$-Tree outperforms the $\mathrm{B}^{+}$-Tree by up to 55 percent in terms of the time needed for flash operations. With a small in-memory cache of 32 KB, the $\mu^{*}$-Tree improves the overall performance by up to five times compared to the $\mathrm{B}^{+}$-Tree with the same cache size.
    Print ISSN: 0018-9340
    Digitale ISSN: 1557-9956
    Thema: Informatik
    Standort Signatur Erwartet Verfügbarkeit
    BibTip Andere fanden auch interessant ...
  • 85
    facet.materialart.
    Unbekannt
    Institute of Electrical and Electronics Engineers (IEEE)
    Publikationsdatum: 2013-03-19
Beschreibung: Large-scale, dense Wireless Sensor Networks (WSNs) will increasingly be deployed in different classes of applications for accurate monitoring. Due to the high density of nodes in these networks, it is likely that redundant data will be detected by nearby nodes when sensing an event. Since energy conservation is a key issue in WSNs, data fusion and aggregation should be exploited in order to save energy. In this case, redundant data can be aggregated at intermediate nodes, reducing the size and number of exchanged messages and thus decreasing communication cost and energy consumption. In this work, we propose a novel Data Routing for In-Network Aggregation scheme, called DRINA, whose key features are a reduced number of messages for setting up a routing tree, a maximized number of overlapping routes, a high aggregation rate, and reliable data aggregation and transmission. The proposed DRINA algorithm was extensively compared to two other well-known solutions: the Information Fusion-based Role Assignment (InFRA) and Shortest Path Tree (SPT) algorithms. Our results clearly indicate that the routing tree built by DRINA provides the best aggregation quality among these algorithms. The obtained results show that our proposed solution outperforms these solutions in different scenarios and in the key aspects required by WSNs.
    Print ISSN: 0018-9340
    Digitale ISSN: 1557-9956
    Thema: Informatik
    Standort Signatur Erwartet Verfügbarkeit
    BibTip Andere fanden auch interessant ...
  • 86
    facet.materialart.
    Unbekannt
    Institute of Electrical and Electronics Engineers (IEEE)
    Publikationsdatum: 2013-03-19
Beschreibung: In the early design stage of processors, Dynamic Thermal Management (DTM) schemes should be evaluated to avoid excessively high temperatures while minimizing performance overhead. In this paper, we show that conventional thermal simulations using a fixed ambient temperature may lead to wrong conclusions in terms of temperature, performance, reliability, and leakage power. Although the ambient temperature converges to a steady-state value after hundreds of seconds when running the SPEC CPU2000 benchmark suite, the steady-state ambient temperature differs significantly depending on the application and the system configuration. To overcome the inaccuracy of conventional thermal simulations, we propose that microarchitectural thermal simulations exploit application/system-dependent ambient temperatures. Our evaluation results reveal that the performance, thermal behavior, reliability, and leakage power of the same DTM scheme differ when the application/system-dependent ambient temperature is used instead of a fixed ambient temperature. For accurate simulation results, future microarchitectural thermal studies should therefore evaluate proposed DTM schemes using application/system-dependent ambient temperatures.
    Print ISSN: 0018-9340
    Digitale ISSN: 1557-9956
    Thema: Informatik
    Standort Signatur Erwartet Verfügbarkeit
    BibTip Andere fanden auch interessant ...
  • 87
    facet.materialart.
    Unbekannt
    Institute of Electrical and Electronics Engineers (IEEE)
    Publikationsdatum: 2013-03-19
Beschreibung: Cycle-efficient implementation of the linear feedback shift register (LFSR) algorithm on a word-based microarchitecture is investigated. This work examines an algorithm transformation method, called term-preserving look-ahead transformation (TePLAT), that transforms the bit-serial LFSR algorithm into a bit-parallel form while preserving the overhead of the original LFSR algorithm. Detailed implementation methodologies as well as extensive simulation results are presented. We apply TePLAT to 25 commonly used LFSRs and test the resulting parallel formulations on two popular word-based microprocessor development platforms: a Texas Instruments C6416 Code Composer Studio simulator and an ARM-9 simulator. In all 25 cases, the TePLAT-transformed LFSR formulations consistently achieve much higher throughput than a naïve implementation and a traditional look-ahead transformation-based implementation. (A look-ahead LFSR sketch follows this entry.)
    Print ISSN: 0018-9340
    Digitale ISSN: 1557-9956
    Thema: Informatik
    Standort Signatur Erwartet Verfügbarkeit
    BibTip Andere fanden auch interessant ...
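The sketch below illustrates the general look-ahead idea on which such transformations build: because an LFSR is linear over GF(2), the k-step state update and the k output bits can be precomputed as XOR masks, so a word-based processor can produce k bits per block step. The tap set, register length, and block size are arbitrary examples; TePLAT's term-preserving formulation itself is not reproduced.

```python
# Fibonacci LFSR with a block (look-ahead) formulation: the k-step update is
# derived by symbolically stepping the bit-serial LFSR on unit vectors.
TAPS = [16, 14, 13, 11]            # a commonly used maximal-length tap set
M = 16                             # register length in bits

def serial_step(state):
    """One bit-serial step: returns (output_bit, next_state)."""
    out = state[-1]
    fb = 0
    for t in TAPS:
        fb ^= state[t - 1]
    return out, [fb] + state[:-1]

def build_masks(k):
    """Record, for each of the k outputs and M next-state bits, which current
    state bits are XORed together (the rows of the k-step transition)."""
    out_masks, state_masks = [[] for _ in range(k)], [[] for _ in range(M)]
    for b in range(M):                       # trace unit vector e_b through k steps
        s = [1 if i == b else 0 for i in range(M)]
        for step in range(k):
            o, s = serial_step(s)
            if o:
                out_masks[step].append(b)
        for i in range(M):
            if s[i]:
                state_masks[i].append(b)
    return out_masks, state_masks

def step_block(state, out_masks, state_masks):
    """Produce k output bits and the advanced state using only XORs."""
    outs = [sum(state[b] for b in mask) & 1 for mask in out_masks]
    nxt = [sum(state[b] for b in mask) & 1 for mask in state_masks]
    return outs, nxt

if __name__ == "__main__":
    k = 8
    om, sm = build_masks(k)
    state = [1] * M
    bits, state = step_block(state, om, sm)
    print(bits)
```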
  • 88
    facet.materialart.
    Unbekannt
    Institute of Electrical and Electronics Engineers (IEEE)
    Publikationsdatum: 2013-03-19
Beschreibung: Let $A_n$ be the alternating group of degree $n$ with $n\ge 3$. Set $S=\{(1\,2\,i),\ (1\,i\,2)\mid 3\le i\le n\}$. The alternating group graph, denoted by $AG_n$, is defined as the Cayley graph on $A_n$ with respect to $S$. Jwo et al. [Networks 23 (1993) 315-326] introduced the alternating group graph $AG_n$ as an interconnection network topology for computing systems. Conditional diagnosability, a new measure of diagnosability introduced by Lai et al. [IEEE Transactions on Computers 54(2) (2005) 165-175], can better measure the diagnosability of regular interconnection networks. This paper determines that, under the PMC model, the conditional diagnosability of $AG_n$ is 4 for $n=4$ and $6n-18$ for each $n\ge 5$. (A small construction of $AG_n$ follows this entry.)
    Print ISSN: 0018-9340
    Digitale ISSN: 1557-9956
    Thema: Informatik
    Standort Signatur Erwartet Verfügbarkeit
    BibTip Andere fanden auch interessant ...
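The definition above is easy to check computationally for small n. The sketch below builds AG_n directly as the Cayley graph of the even permutations with the generators (1 2 i) and (1 i 2), and verifies the vertex count and the 2(n-2)-regularity for a toy size; it does not compute conditional diagnosability.

```python
from itertools import permutations

def is_even(p):
    """Parity of a permutation given in one-line (image) form."""
    inv = sum(1 for a in range(len(p)) for b in range(a + 1, len(p)) if p[a] > p[b])
    return inv % 2 == 0

def apply_cycle(p, cycle):
    """Apply a cycle on positions to permutation p (tuple in one-line form)."""
    q = list(p)
    vals = [p[c - 1] for c in cycle]
    for idx, c in enumerate(cycle):
        q[cycle[(idx + 1) % len(cycle)] - 1] = vals[idx]
    return tuple(q)

def alternating_group_graph(n):
    vertices = [p for p in permutations(range(1, n + 1)) if is_even(p)]
    gens = [(1, 2, i) for i in range(3, n + 1)] + [(1, i, 2) for i in range(3, n + 1)]
    return {v: {apply_cycle(v, g) for g in gens} for v in vertices}

if __name__ == "__main__":
    adj = alternating_group_graph(4)
    degrees = {len(nbrs) for nbrs in adj.values()}
    print(len(adj), degrees)        # |A_4| = 12 vertices, all of degree 2(n-2) = 4
```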
  • 89
    facet.materialart.
    Unbekannt
    Institute of Electrical and Electronics Engineers (IEEE)
    Publikationsdatum: 2013-03-19
Beschreibung: While several hardware mechanisms have been proposed to control the interaction between hardware threads in an SMT processor, few have addressed the issue of software-controllable SMT performance. The IBM POWER5 and POWER6 are the first high-performance processors implementing a software-controllable hardware-thread prioritization mechanism that controls the rate at which each hardware thread decodes instructions. This paper shows the potential of this basic mechanism to improve several target metrics for various applications on POWER5 and POWER6 processors. Our results show that, although the software interface is exactly the same, the software-controlled priority mechanism has a different effect on POWER5 and POWER6. For instance, hardware threads in POWER6 are less sensitive to priorities than in POWER5 due to its in-order design. We study SMT thread malleability to enable user-level optimizations that leverage software-controlled thread priorities. We also show how to achieve various system objectives, such as load balancing a parallel application to reduce its execution time. Finally, we characterize user-level transparent execution on POWER5 and POWER6 and identify the workload mix that benefits most from it.
    Print ISSN: 0018-9340
    Digitale ISSN: 1557-9956
    Thema: Informatik
    Standort Signatur Erwartet Verfügbarkeit
    BibTip Andere fanden auch interessant ...
  • 90
    facet.materialart.
    Unbekannt
    Institute of Electrical and Electronics Engineers (IEEE)
    Publikationsdatum: 2013-03-19
Beschreibung: This paper proposes a novel technique called mRT-PLRU (Multitasking Real-Time constrained combination of Pinning and LRU), a generic framework for using inexpensive nonvolatile NAND flash memory to store and execute real-time programs in multitasking environments. In order to execute multiple real-time tasks stored in NAND flash memory with minimal use of expensive RAM, mRT-PLRU is optimally configured in two steps. In the first step, a per-task analysis finds the function of RAM size versus execution time (and the corresponding optimal pinning/LRU combination) for each individual task. Using these functions for all tasks as inputs, the second step, called stochastic-analysis-in-loop optimization, conducts an iterative convex optimization with a stochastic analysis for the probabilistic schedulability check. As a result, the optimization loop can determine the RAM sizes for multiple tasks such that their deadlines are probabilistically guaranteed with the minimal total RAM size. The usefulness of the developed technique is verified through both simulation and an actual implementation. Our experimental study shows that mRT-PLRU can save up to 80 percent of the RAM required by the industry-common shadowing approach.
    Print ISSN: 0018-9340
    Digitale ISSN: 1557-9956
    Thema: Informatik
    Standort Signatur Erwartet Verfügbarkeit
    BibTip Andere fanden auch interessant ...
  • 91
    facet.materialart.
    Unbekannt
    Institute of Electrical and Electronics Engineers (IEEE)
    Publikationsdatum: 2013-03-19
Beschreibung: Processor fault diagnosis plays an important role in measuring the reliability of multiprocessor systems and diagnosing many well-known interconnection networks. Conditional diagnosability is a novel measure of diagnosability that adds the additional condition that any faulty set cannot contain all of the neighbors of any vertex in a system. This study investigates some topological properties of $k$-ary $n$-cubes, where $k \ge 4$ and $n \ge 4$, and shows that the conditional diagnosability of $k$-ary $n$-cubes under the comparison diagnosis model is $6n-5$. (A small construction of the $k$-ary $n$-cube follows this entry.)
    Print ISSN: 0018-9340
    Digitale ISSN: 1557-9956
    Thema: Informatik
    Standort Signatur Erwartet Verfügbarkeit
    BibTip Andere fanden auch interessant ...
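For concreteness, the sketch below constructs the k-ary n-cube from its standard definition (n-digit radix-k labels, adjacency by a ±1 change modulo k in one digit) and checks its 2n-regularity for a toy size; the diagnosability argument itself is not reproduced.

```python
from itertools import product

def k_ary_n_cube(k, n):
    """Adjacency map of the k-ary n-cube: vertices are n-digit radix-k vectors."""
    adj = {}
    for v in product(range(k), repeat=n):
        nbrs = set()
        for pos in range(n):
            for delta in (1, -1):
                u = list(v)
                u[pos] = (u[pos] + delta) % k      # change one digit by +/-1 mod k
                nbrs.add(tuple(u))
        adj[v] = nbrs
    return adj

if __name__ == "__main__":
    adj = k_ary_n_cube(4, 3)
    print(len(adj), {len(nbrs) for nbrs in adj.values()})   # 64 vertices, degree 2n = 6
```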
  • 92
    facet.materialart.
    Unbekannt
    Institute of Electrical and Electronics Engineers (IEEE)
    Publikationsdatum: 2013-03-19
Beschreibung: Extreme scale systems available before the end of this decade are expected to have 100 million to 1 billion CPU cores. The probability that a failure occurs during an application execution is expected to be much higher than in today's systems. Counteracting this higher failure rate may require a combination of disk-based checkpointing, diskless checkpointing, and algorithmic fault tolerance. Diskless checkpointing is an efficient technique to tolerate a small number of process failures in large parallel and distributed systems. In the literature, a simultaneous failure of no more than $N$ processes is often tolerated by using a one-level Reed-Solomon checkpointing scheme for $N$ simultaneous process failures, whose overhead often increases quickly as $N$ increases. We introduce an $N$-level diskless checkpointing scheme that reduces the overhead for tolerating a simultaneous failure of up to $N$ processes. Each level is a diskless checkpointing scheme for a simultaneous failure of $i$ processes, where $i=1, 2, \ldots, N$. Simulation results indicate the proposed $N$-level diskless checkpointing scheme achieves lower fault tolerance overhead than the one-level Reed-Solomon checkpointing scheme for $N$ simultaneous processor failures. (A one-level parity sketch follows this entry.)
    Print ISSN: 0018-9340
    Digitale ISSN: 1557-9956
    Thema: Informatik
    Standort Signatur Erwartet Verfügbarkeit
    BibTip Andere fanden auch interessant ...
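The lowest level of such a scheme can be illustrated with simple XOR parity: each process keeps its checkpoint in memory, a checksum process stores the bytewise XOR of all checkpoints, and any single lost checkpoint is rebuilt from the survivors. The sketch below shows only this one-failure case with toy data; the multi-level Reed-Solomon machinery of the paper is not modelled.

```python
def xor_bytes(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def make_parity(checkpoints):
    """Parity checkpoint held by the dedicated checksum process."""
    parity = bytes(len(checkpoints[0]))
    for ckpt in checkpoints:
        parity = xor_bytes(parity, ckpt)
    return parity

def recover(checkpoints, parity, failed):
    """Rebuild the checkpoint of the single failed process from the survivors."""
    rebuilt = parity
    for rank, ckpt in enumerate(checkpoints):
        if rank != failed:
            rebuilt = xor_bytes(rebuilt, ckpt)
    return rebuilt

if __name__ == "__main__":
    ckpts = [bytes([rank] * 8) for rank in range(4)]   # toy per-process states
    parity = make_parity(ckpts)
    assert recover(ckpts, parity, failed=2) == ckpts[2]
    print("recovered process 2's checkpoint")
```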
  • 93
    facet.materialart.
    Unbekannt
    Institute of Electrical and Electronics Engineers (IEEE)
    Publikationsdatum: 2013-03-19
Beschreibung: We address the problem of information brokerage, in which information consumers search for the data acquired by information producers. To the best of our knowledge, no retrieval-guaranteed, location-aware information brokerage scheme with a bounded data retrieval path length and bounded replication and retrieval message overheads has been available for 3D wireless ad hoc networks to date. In this paper, we propose a novel location-aware information brokerage scheme, termed LAIB, in which the network area is divided into cube grids, and data are replicated and retrieved at the hashed geographic location in each grid designated by the producer and the consumer, respectively. In LAIB, a polylogarithmic number of grids are designated by the producer and by the consumer, and at least one grid designated by both lies closer to the consumer's grid than the producer's grid does. Simulations show that, when the network area is divided into a moderate number of grids, LAIB achieves good performance in terms of retrieval latency stretch while keeping the replication memory, replication message, and retrieval message overheads moderate.
    Print ISSN: 0018-9340
    Digitale ISSN: 1557-9956
    Thema: Informatik
    Standort Signatur Erwartet Verfügbarkeit
    BibTip Andere fanden auch interessant ...
  • 94
    facet.materialart.
    Unbekannt
    Institute of Electrical and Electronics Engineers (IEEE)
    Publikationsdatum: 2013-04-03
Beschreibung: The error-detecting problem for limited-magnitude errors over high-radix channels is studied. In this error model, the error magnitude does not exceed a certain limit that is known beforehand. For asymmetric, unidirectional, and symmetric channels, both all-error-detecting and $t$-error-detecting codes are studied. In all these cases, close-to-optimal codes are proposed.
    Print ISSN: 0018-9340
    Digitale ISSN: 1557-9956
    Thema: Informatik
    Standort Signatur Erwartet Verfügbarkeit
    BibTip Andere fanden auch interessant ...
  • 95
    facet.materialart.
    Unbekannt
    Institute of Electrical and Electronics Engineers (IEEE)
    Publikationsdatum: 2013-04-03
Beschreibung: Research on compiler techniques for thread-level loop speculation has so far focused on studying its performance limits: loop candidates worth parallelizing are selected manually by the researchers or based on extensive profiling and pre-execution. It is therefore difficult to include such techniques in a production compiler for speculative multithreaded multicore processors. In a way, existing techniques are statically adaptive ("realized" by the researchers for different inputs) yet dynamically greedy (since all iterations of all selected loop candidates are always parallelized at run time). This paper introduces a Statically GrEEdy and Dynamically Adaptive (SEED) approach to thread-level speculation on loops that is quite different from most existing techniques. SEED relies on the compiler to select and optimize loop candidates greedily (possibly in an input-independent way) and provides a runtime scheduler to schedule loop iterations adaptively. To select loops for parallelization at runtime (subject to program inputs), loop iterations are prioritized in terms of their potential benefits rather than their degree of speculation as in many prior studies. In our current implementation, the benefits of speculative threads are estimated by a simple yet effective cost model. It comprises a mechanism for efficiently tracing the loop nesting structure of the program and a mechanism for predicting the outcome of speculative threads. We have evaluated SEED using a set of SPECint2000 and Olden benchmarks. Compared to existing techniques with a program's loop candidates ideally selected a priori, SEED achieves comparable or better performance while automating the entire loop candidate selection process.
    Print ISSN: 0018-9340
    Digitale ISSN: 1557-9956
    Thema: Informatik
    Standort Signatur Erwartet Verfügbarkeit
    BibTip Andere fanden auch interessant ...
  • 96
    facet.materialart.
    Unbekannt
    Institute of Electrical and Electronics Engineers (IEEE)
    Publikationsdatum: 2013-04-03
Beschreibung: Given the recent progress in the evolution of high-performance computing (HPC) technologies, research in computational intelligence has entered a new era. In this paper, we present an HPC-based context-aware intelligent text recognition system (ITRS) that serves as the physical layer of machine reading. A parallel computing architecture is adopted that combines HPC technologies with advances in neuromorphic computing models. The algorithm learns from what has been read and, based on the obtained knowledge, forms anticipations of word- and sentence-level context. The information processing flow of the ITRS imitates the function of the neocortex: it combines a large number of simple pattern-detection modules with an advanced information-association layer to achieve perception and recognition. Such an architecture provides robust performance on images with heavy noise. After performance optimization, the implemented ITRS software is able to process about 16 to 20 scanned pages per second on the 500-TFLOPS (trillion floating-point operations per second) Air Force Research Laboratory (AFRL)/Information Directorate (RI) Condor HPC.
    Print ISSN: 0018-9340
    Digitale ISSN: 1557-9956
    Thema: Informatik
    Standort Signatur Erwartet Verfügbarkeit
    BibTip Andere fanden auch interessant ...
  • 97
    facet.materialart.
    Unbekannt
    Institute of Electrical and Electronics Engineers (IEEE)
    Publikationsdatum: 2013-04-03
Beschreibung: Tables-and-additions methods for the accurate computation of elementary functions are fast but require large amounts of memory. A memory-efficient method named the integrated Add-Table Lookup-Add (iATA) method is proposed in this paper. In iATA, the mathematical formulation for computing the elementary functions is derived without using the central-difference formulation, in order to save memory. Three additional techniques, namely a carry-select technique, the exploitation of symmetry, and unequal partitioning of the input guided by error analysis, are integrated into iATA to further reduce the memory size. The experimental results show that the proposed method achieves higher memory efficiency than the best existing tables-and-additions methods. For the reciprocal and the natural logarithm functions, iATA saves 23.63 and 61.39 percent of memory compared to the best existing results obtained, respectively, by the unified Multipartite Table Method [39] and the Symmetric Table Addition Method [37]. (A generic table-lookup-and-add sketch follows this entry.)
    Print ISSN: 0018-9340
    Digitale ISSN: 1557-9956
    Thema: Informatik
    Standort Signatur Erwartet Verfügbarkeit
    BibTip Andere fanden auch interessant ...
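The general tables-and-additions idea can be illustrated with a small bipartite-style approximation: the input is split into three bit fields, one table is indexed by the two upper fields and a second, first-order correction table by the upper and lower fields, and the result is their sum. The function, bit split, and table construction below are generic illustrative choices, not the iATA formulation or its error analysis.

```python
# Bipartite-style table-lookup-and-add approximation of f on [1, 2).
N0, N1, N2 = 4, 4, 4              # bits of the three input segments (toy split)
W = N0 + N1 + N2                  # total fraction bits of x in [1, 2)

f  = lambda x: 1.0 / x            # example function: reciprocal
df = lambda x: -1.0 / (x * x)

def split(i):
    """Split a W-bit fraction index into its three segments."""
    x2 = i & ((1 << N2) - 1)
    x1 = (i >> N2) & ((1 << N1) - 1)
    x0 = i >> (N1 + N2)
    return x0, x1, x2

# Table of initial values: f evaluated with the low segment at its midpoint.
TIV = {}
for x0 in range(1 << N0):
    for x1 in range(1 << N1):
        x = 1.0 + (((x0 << N1) + x1) << N2) / (1 << W) + (0.5 * (1 << N2)) / (1 << W)
        TIV[(x0, x1)] = f(x)

# Table of offsets: first-order correction that depends only on (x0, x2).
TO = {}
for x0 in range(1 << N0):
    for x2 in range(1 << N2):
        x_mid = 1.0 + ((x0 << (N1 + N2)) + (1 << (N1 + N2 - 1))) / (1 << W)
        dx = (x2 + 0.5 - 0.5 * (1 << N2)) / (1 << W)
        TO[(x0, x2)] = df(x_mid) * dx

def approx(i):
    """One lookup in TIV, one in TO, one addition."""
    x0, x1, x2 = split(i)
    return TIV[(x0, x1)] + TO[(x0, x2)]

worst = max(abs(approx(i) - f(1.0 + i / (1 << W))) for i in range(1 << W))
print("max abs error:", worst)    # small, with far fewer table entries than 2^W
```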
  • 98
    facet.materialart.
    Unbekannt
    Institute of Electrical and Electronics Engineers (IEEE)
    Publikationsdatum: 2013-04-03
    Beschreibung: Many-core architectures provide an efficient way of harnessing the growing numbers of transistors available. However, energy and latency costs of communication increasingly limit the parallel programs running on these platforms. Existing designs provide a functional communication layer, but not necessarily the most efficient solution. Due to power limitations, efficiency is now a primary concern that motivates us to look again at cache coherence. First, we analyze the communication behavior of parallel applications. The observed sharing patterns reveal considerable locality of shared data accesses between threads with consecutive IDs. This pattern corresponds to strong physical locality between adjacent cores in a chip-multiprocessor (CMP). This paper explores the design of Proximity Coherence: a novel scheme in which L1 load misses are optimistically forwarded to nearby caches via new dedicated links. We exploit these patterns and improve the efficiency of communication. The results show that careful analysis leads to the design of a more efficient coherence protocol. The protocol reduces the latency of load misses by up to 33 percent (17 percent, on average), improving overall execution time by up to 13 percent. Furthermore, it also reduces network-on-chip traffic by 19 percent and energy consumption by up to 30 percent.
    Print ISSN: 0018-9340
    Digitale ISSN: 1557-9956
    Thema: Informatik
    Standort Signatur Erwartet Verfügbarkeit
    BibTip Andere fanden auch interessant ...
  • 99
    facet.materialart.
    Unbekannt
    Institute of Electrical and Electronics Engineers (IEEE)
    Publikationsdatum: 2013-04-03
Beschreibung: We consider the problem of least-latency end-to-end routing over adaptively duty-cycled wireless sensor networks. Such networks exhibit a time-dependent behavior, where the link cost and transmission latency from one node to other nodes vary constantly across discrete time moments. We model the problem as the time-dependent Bellman-Ford problem. We show that such networks satisfy the first-in-first-out (FIFO) property, which makes the time-dependent Bellman-Ford problem solvable in polynomial time. Using the $\beta$-synchronizer, we propose a fast distributed algorithm to construct all-to-one shortest paths with polynomial message and time complexity. The algorithm determines the shortest paths for all discrete times in a single execution, in contrast with the multiple executions needed by previous solutions. We further propose an efficient distributed algorithm for time-dependent shortest path (TDSP) maintenance. The proposed algorithm is loop-free, with low message complexity and low space complexity of $O(\mathit{maxdeg})$, where $\mathit{maxdeg}$ is the maximum degree over all nodes. We discuss a suboptimal implementation of our proposed algorithms that reduces their memory requirement. The performance of our algorithms is evaluated experimentally under diverse network configurations. The results reveal that our algorithms are more efficient than previous solutions in terms of message cost and space cost. (A small FIFO time-dependent Bellman-Ford sketch follows this entry.)
    Print ISSN: 0018-9340
    Digitale ISSN: 1557-9956
    Thema: Informatik
    Standort Signatur Erwartet Verfügbarkeit
    BibTip Andere fanden auch interessant ...
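The sketch below shows the single-source, single-departure-time version of the underlying idea: Bellman-Ford relaxation with time-dependent edge delays, which computes earliest arrival times when the FIFO property holds. The duty-cycle model, wake-up offsets, and topology are toy assumptions, and the distributed, all-to-one, all-times algorithm of the paper is not reproduced.

```python
INF = float("inf")

def edge_delay(u, v, t, period=8, wakeup=None):
    """Time-dependent delay: wait for v's next wake-up slot, plus one hop.
    Arrival time is nondecreasing in t, so the FIFO property holds."""
    w = wakeup[(u, v)]
    wait = (w - t) % period
    return wait + 1

def td_bellman_ford(nodes, edges, wakeup, src, t0, period=8):
    """Earliest arrival time from src (departing at t0) to every node."""
    arrival = {v: INF for v in nodes}
    arrival[src] = t0
    for _ in range(len(nodes) - 1):          # classic |V|-1 relaxation rounds
        for (u, v) in edges:
            if arrival[u] == INF:
                continue
            t = arrival[u]
            cand = t + edge_delay(u, v, t, period, wakeup)
            if cand < arrival[v]:
                arrival[v] = cand
    return arrival

if __name__ == "__main__":
    nodes = ["a", "b", "c", "d"]
    edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]
    # each directed link carries the wake-up offset of its receiver's duty cycle
    wakeup = {("a", "b"): 3, ("b", "c"): 1, ("a", "c"): 6, ("c", "d"): 2}
    print(td_bellman_ford(nodes, edges, wakeup, "a", t0=0))
```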
  • 100
    Publikationsdatum: 2013-04-03
Beschreibung: Computational grids provide a massive source of processing power and the means to support processor-intensive applications. The strong burstiness and unpredictability of the available resources raise the need to make applications robust against the dynamics of the grid environment. The two main techniques most suitable for coping with the dynamic nature of the grid are load balancing and job replication. In this work, we develop a load-balancing algorithm that juxtaposes the strong points of neighbor-based and cluster-based load-balancing methods. We then integrate the proposed load-balancing approach with a fault-tolerant scheduling scheme, namely MinRC, and develop a performance-driven fault-tolerant load-balancing algorithm, PD_MinRC, for independent jobs. In order to improve system flexibility and reliability and to save system resources, PD_MinRC employs a passive replication scheme. Our main objective is to arrive at job assignments that achieve minimum response time, maximum resource utilization, and a well-balanced load across all the resources involved in a grid. Experiments were conducted to show the applicability of PD_MinRC. One advantage of our approach is its relatively low overhead and robust performance against resource failures and inaccuracies in performance-prediction information.
    Print ISSN: 0018-9340
    Digitale ISSN: 1557-9956
    Thema: Informatik
    Standort Signatur Erwartet Verfügbarkeit
    BibTip Andere fanden auch interessant ...