Energy-efficient and damage-recovery slithering gait design for a snake-like robot based on reinforcement learning and inverse reinforcement learning
Design of an energy-saving, damage-recovery slithering gait for a snake-like robot based on reinforcement learning and inverse reinforcement learning - 2020
Similar to real snakes in nature, the flexible trunks of snake-like robots enhance their movement capabilities and adaptability in diverse environments. However, this flexibility corresponds to a complex control task involving highly redundant degrees of freedom, where traditional model-based methods usually fail to propel the robots energy-efficiently and to adapt to unforeseeable joint damage. In this work, we present an approach for designing an energy-efficient and damage-recovery slithering gait for a snake-like robot using reinforcement learning (RL) and inverse reinforcement learning (IRL). Specifically, we first present an RL-based controller for generating locomotion gaits at a wide range of velocities, which is trained using the proximal policy optimization (PPO) algorithm. Then, by taking the RL-based controller as an expert and collecting trajectories from it, we train an IRL-based controller using the adversarial inverse reinforcement learning (AIRL) algorithm. For the purpose of comparison, a traditional parameterized gait controller is presented as the baseline, and its parameter sets are optimized using the grid search and Bayesian optimization algorithms. Based on the analysis of the simulation results, we first demonstrate that the RL-based controller exhibits very natural and adaptive movements, which are also substantially more energy-efficient than the gaits generated by the parameterized controller. We then demonstrate that the IRL-based controller can not only perform similarly to the RL-based controller, but can also recover from unpredictable damage to body joints and still outperform the model-based controller, which has an undamaged body, in terms of energy efficiency. Videos can be viewed at https://videoviewsite.wixsite.com/rlsnake.
Keywords: Snake-like robot | Reinforcement learning | Inverse reinforcement learning | Motion planning | Damage recovery
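The PPO training mentioned in the abstract above is commonly built on a clipped surrogate objective. The following is a minimal Python sketch of that per-sample term, not the authors' implementation: the function name `ppo_clip_term` and the clipping range `epsilon=0.2` are assumptions for illustration, and the abstract does not state which PPO variant or hyperparameters were used.

```python
def ppo_clip_term(ratio, advantage, epsilon=0.2):
    """Per-sample clipped surrogate term of PPO (to be maximized).

    ratio     -- new-policy probability / old-policy probability for the action
    advantage -- estimated advantage of the action
    epsilon   -- clipping range (0.2 is an assumed, commonly used value)
    """
    # Keep the probability ratio inside [1 - epsilon, 1 + epsilon].
    clipped_ratio = max(1.0 - epsilon, min(1.0 + epsilon, ratio))
    # Take the more pessimistic of the unclipped and clipped estimates.
    return min(ratio * advantage, clipped_ratio * advantage)
```

With a positive advantage, the term stops growing once the ratio exceeds 1 + epsilon, which removes the incentive to move the policy far from the one that collected the data.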
Entrepreneurship, fear of failure, and economic policy
Entrepreneurship, fear of failure, and economic policy - 2020
The previous literature finds that self-reported ‘fear of failure’ has a significant negative effect on individuals’ choice to become entrepreneurs. We hypothesize this effect is lessened in economies with a larger number of additional, alternative, entrepreneurial opportunities to pursue if a failure occurs. Prior literature also concludes the number of entrepreneurial opportunities is enhanced significantly by having policies and institutions consistent with higher levels of economic freedom. We therefore test and confirm that fear of failure hurts the entrepreneurial process less when levels of economic freedom are higher as there are more additional chances for failed entrepreneurs to pursue.
Keywords: Entrepreneurship | Fear of failure | Economic policy | Economic freedom | Business climate
Evaluation of the implementation of a subset of ISO/IEC 29110 Software Implementation process in four teams of undergraduate students of Ecuador. An empirical software engineering experiment
Evaluating the implementation of a subset of the ISO/IEC 29110 Software Implementation process in four teams of undergraduate students in Ecuador: an empirical software engineering experiment - 2020
The competitiveness of software development companies depends on their ability to offer software products with quality attributes within approved budget and schedule. Most Very Small Entities (VSEs) that develop software do not see the benefits of implementing software standards. Consequently, they limit their potential to be recognised as quality software development entities. In this study, the authors present results obtained through the application of empirical software engineering in an experiment in which the ISO/IEC TR 29110–5–1–2 “Software engineering – Lifecycle profiles for Very Small Entities (VSEs) – Part 5–1–2: Management and engineering guide: Generic profile group: Basic profile” was used. The guide includes two processes: the Project Management (PM) process and the Software Implementation (SI) process. The objective of the project was the development of a software product for scheduling medical appointments at the Student Wellness Center of a university in Ecuador. Four teams of undergraduate students were involved. Two of them (controlled teams) implemented a subset of the SI process, while the other two (non-controlled teams) were free to choose development activities, which were subsequently mapped to the activities of the standard. All teams developed the software product using the SCRUM framework within the same timeframe. Although the experiment focused on the SI process, the teams also used a tailored version of the PM process defined by the professors. The experiment execution encountered several difficulties. For example, the timeframe of six weeks established in the design of the experiment was too short, since students worked part-time on the project. All the teams experienced this difficulty, especially when they had to construct and test the software components. Overall, the teams that used the ISO/IEC TR 29110–5–1–2 guide achieved better scores in the quality evaluation of their software processes.
Keywords: ISO/IEC 29110 | ISO/IEC 25000 | Software implementation process | Experimentation | Empirical software engineering | Software quality
A smart community energy management scheme considering user dominated demand side response and P2P trading
A smart community energy management scheme considering user-dominated demand-side response and P2P trading - 2020
This paper proposes a Peer-to-Peer (P2P) local community energy pool and a User Dominated Demand Side Response (UDDSR) scheme that can help energy sharing and reduce the energy bills of a smart community. The proposed UDDSR allows energy users within the community to submit flexible Demand Response (DR) bids to the Community Energy Management Scheme (EMS), with flexible start times, stop times and response durations that respect users’ comfort zones for electric heating systems, electric vehicles and other home appliances, which gives maximum freedom to the DR participants. The scheduling of the DR bids, originally a multi-objective optimization problem (maximize the total flexible demand and the flexible demand in every interval during the whole DR duration), is transformed into a single-objective optimization problem (maximize the total demand with a penalty for demand imbalance during the whole DR duration), which significantly decreases the computational complexity. Furthermore, to facilitate efficient energy usage among neighbourhoods, a local energy pool is proposed to enable energy trading among users, so that surplus energy is used within the community. The electricity price of the energy pool is determined by the real-time demand/supply ratio, and upper and lower limits on the price are configured to ensure profitability for all participants within the pool. To evaluate the performance of the proposed UDDSR and local energy pool, a comprehensive numerical analysis is conducted. It is found that energy pool participants without PV can save at least 6.16% on their electricity bills (when the PV penetration level equals 20%). Energy pool participants with PV can get a much better return (at least a 13.4% profit increase) on their PV generation compared to the conventional Feed-in-Tariff. If energy users join the UDDSR scheme, they can get a further return, and the proposed UDDSR can provide a constant load reduction/increase during every time interval of the whole DR event. If a Battery Energy Storage System (BESS) is included in the DR operation, the usage efficiency of customers’ flexible loads can exceed 85%.
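The pricing rule described in the abstract above (a price driven by the real-time demand/supply ratio and bounded by upper and lower limits) can be illustrated with a minimal Python sketch. The linear mapping from ratio to price, the function name `pool_price`, and the zero-supply behaviour are assumptions made for illustration; the abstract does not give the actual pricing formula.

```python
def pool_price(demand_kw, supply_kw, lower_limit, upper_limit):
    """Hypothetical energy-pool price bounded by [lower_limit, upper_limit].

    Assumed rule (not from the paper): the price rises linearly with the
    demand/supply ratio, reaching the upper limit when demand equals supply.
    """
    if lower_limit > upper_limit:
        raise ValueError("lower_limit must not exceed upper_limit")
    if supply_kw <= 0:
        # Assumption: with no local surplus, the pool price sits at the upper limit.
        return upper_limit
    ratio = demand_kw / supply_kw
    raw_price = lower_limit + (upper_limit - lower_limit) * ratio
    # Clamp so that neither buyers nor sellers are pushed outside the limits.
    return min(upper_limit, max(lower_limit, raw_price))
```

In a setup like this, the lower limit would typically sit above the Feed-in-Tariff and the upper limit below the retail tariff, so that both sellers and buyers are better off trading in the pool; those reference tariffs are likewise an assumption here.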
DNA, the imperfect proof
DNA, the imperfect proof - 2020
Due to the progress of science and the stakes of inquiry and sentencing, DNA analysis has undergone substantial development within the area of criminal procedure. However, DNA is by no means perfect evidence: it faces scientific, ethical and legal limits, which call for reconsidering the balance between the stakes of punishment and the protection of Fundamental Freedoms.
Keywords: DNA (criminology) | Genetic prints | File (genetic prints) | Expertise (genetic)
Nonlinear response spectrum analysis of structures equipped with nonlinear power law viscous dampers
Nonlinear response spectrum analysis of structures equipped with nonlinear power-law viscous dampers - 2020
Response spectrum analysis is recognized as a reliable and practical method for dynamic analysis of structures subjected to earthquake excitation. However, for structures equipped with nonlinear viscous dampers (NVDs), the classical linear response structural analysis cannot be applied due to the nonlinearity induced by such devices, typically in the form of a power law function of the velocity. In this paper, the nonlinear differential equation governing the dynamic response of a single-degree-of-freedom (SDOF) system equipped with an NVD is first converted into a set of surrogate linear differential equations using the perturbation technique. The first linear system (zero order) is excited by the real earthquake acceleration time-history, whereas the other SDOF surrogate systems are subjected to virtual excitations based on the velocity responses resulting from the previous linear equations (in a recursive fashion). By performing response spectrum analysis on each surrogate SDOF linear equation and combining the results, the nonlinear response spectrum is estimated. By using the Fourier transform and simplifying the frequency-dependent functions, a convenient method is presented for practical design purposes. The proposed method lends itself to ready adoption in international codes of practice by establishing an equivalent damping ratio. The proposed method, introduced for an SDOF system, is easily extended to multi-storey building structures equipped with different NVDs, and is applied to one two-storey and two six-storey building frames. The results obtained through the proposed method are in very good agreement with results obtained by nonlinear time-history analyses for a wide set of parameter combinations.
Keywords: Nonlinear response spectrum analysis | Nonlinear viscous damper | Earthquake engineering | Perturbation technique | Power law damping | Fourier transform | Seismic analysis | Response spectrum method
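The "power law function of the velocity" mentioned in the abstract above is usually written as F = c · sgn(v) · |v|^α. The sketch below evaluates that commonly used form in Python; the symbols `c` (damping coefficient) and `alpha` (velocity exponent) and the function name `nvd_force` are assumptions, since the abstract does not give the paper's notation.

```python
import math


def nvd_force(velocity, c, alpha):
    """Force of a power-law viscous damper: F = c * sgn(v) * |v|**alpha.

    velocity -- relative velocity across the damper
    c        -- damping coefficient (assumed symbol)
    alpha    -- velocity exponent, alpha > 0 (assumed symbol)
    """
    # copysign restores the direction of motion after taking |v|**alpha,
    # so the force always acts along the velocity's sign convention.
    return c * math.copysign(abs(velocity) ** alpha, velocity)
```

For `alpha = 1` the expression reduces to the linear viscous damper F = c · v, which is the case classical linear analysis can handle; any other exponent introduces the nonlinearity the abstract refers to.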
Influence of institutional economics on firm birth and death: A comparative analysis of hospitality and other industries
The influence of institutional economics on firm birth and death: a comparative analysis of hospitality and other industries - 2020
This paper investigates how public policies, such as taxes and regulations, influence firm formation (birth) and closure (death) in the hospitality and other industries in the United States (US), using an institutional economics approach and the dimensions of the Economic Freedom of North America (EFNA) index. The literature has been scant when it comes to examining the effects of policies of formal institutions on firms’ birth and death in the hospitality industry, and whether these effects in hospitality differ from those of other sectors. The study uses panel data from government sources and the EFNA dimensions and applies cross-sectional dependence and unit root tests, followed by a panel generalized least squares approach for the analysis. Our findings show that components of economic freedom have varying effects on firms’ birth and death. The study provides practical contributions for policymakers and managers by improving the understanding of firm births and deaths in the US.
Keywords: Entrepreneurship | Hospitality industry | Economic freedom | Institutional economics | Firm birth and death | Public policies
Comparing prison and documentary contexts on the treatment of incest offenders. An interview with Guillaume Massart
Comparing prison and documentary contexts in the treatment of incest offenders: an interview with Guillaume Massart - 2020
Objective: To study the effects of penal and cinematographic contexts at Casabianda, a prison without bars located in Corsica, in which 80% of inmates have committed acts of sexual violence upon family members younger than fifteen years old. Method: We interviewed Guillaume Massart, director of the film La Liberté, who followed the trajectory of men convicted of incestuous acts for one year. Results: The film leads to an exploration of the environment and how it impacts speech. Here, a certain freedom of expression becomes possible for the prisoners through the disruption of the relative muteness brought on by the devastating effect of prison life. Discussion: Once the typical prison experiences of confinement and brutal violence are removed, it becomes possible to better understand the weight of individual factors on the possibility, or the impossibility, for the inmates to subjectivize their trajectories. The manner of investigation carried out by this documentary approach allows some prisoners to seize this unexpected opportunity for expression, using language that reveals internal conflicts, taking advantage of the free association allowed by the intimate relationship that develops with the filmmaker. Conclusion: The cinematographic and penal contexts seem to enact a reconciliation between the geographic space that surrounds the prisoners and their psychic space, so that even the unthinkable of incest does not foreclose the possibility of transformation, with potential outcomes for the psychic treatment of the violence of these criminals.
Keywords: Prison | Documentary film | Sex offenders | Incest | Violence | Therapeutic mediation
Reinforcement learning in dual-arm trajectory planning for a free-floating space robot
Reinforcement learning in dual-arm trajectory planning for a free-floating space robot - 2020
A free-floating space robot exhibits strong dynamic coupling between the arm and the base, and the resulting position of the end of the arm depends not only on the joint angles but also on the state of the base. Dynamic modeling is complicated for multiple degree of freedom (DOF) manipulators, especially for a space robot with two arms. Therefore, the trajectories are typically planned offline and tracked online. However, this approach is not suitable if the target has relative motion with respect to the servicing space robot. To handle this issue, a model-free reinforcement learning strategy is proposed for training a policy for online trajectory planning without establishing the dynamic and kinematic models of the space robot. The model-free learning algorithm learns a policy that maps states to actions via trial and error in a simulation environment. With the learned policy, which is represented by a feedforward neural network with 2 hidden layers, the space robot can schedule and perform actions quickly and can be implemented for real-time applications. The feasibility of the trained policy is demonstrated for both fixed and moving targets.
Keywords: On-orbit servicing | Free-floating space robot | Dual-arm trajectory planning | Reinforcement learning | Fixed and moving targets
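The abstract above describes the learned policy as a feedforward neural network with 2 hidden layers that maps states to actions. A minimal pure-Python sketch of such a forward pass follows. The layer sizes, the tanh activations, the random untrained weights, and all names (`init_layer`, `dense`, `policy`) are hypothetical; the abstract specifies only the number of hidden layers.

```python
import math
import random


def init_layer(n_in, n_out, rng):
    """Return (weights, biases) for one dense layer with small random weights."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense(x, layer, activation):
    """Apply one fully connected layer followed by an activation function."""
    weights, biases = layer
    return [activation(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(weights, biases)]


def policy(state, layers):
    """Map a state vector to an action vector through two hidden layers."""
    hidden1 = dense(state, layers[0], math.tanh)
    hidden2 = dense(hidden1, layers[1], math.tanh)
    # Assumption: tanh output keeps each action component within [-1, 1].
    return dense(hidden2, layers[2], math.tanh)


# Hypothetical dimensions, chosen only for illustration.
STATE_DIM, HIDDEN_DIM, ACTION_DIM = 12, 64, 6
rng = random.Random(0)
layers = [
    init_layer(STATE_DIM, HIDDEN_DIM, rng),
    init_layer(HIDDEN_DIM, HIDDEN_DIM, rng),
    init_layer(HIDDEN_DIM, ACTION_DIM, rng),
]
action = policy([0.0] * STATE_DIM, layers)
```

A forward pass of this size is only a few thousand multiply-adds, which is consistent with the abstract's point that the trained policy can be evaluated quickly enough for real-time use.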
Modeling pedestrian-cyclist interactions in shared space using inverse reinforcement learning
Modeling pedestrian-cyclist interactions in shared space using inverse reinforcement learning - 2020
The objective of this study is to model the microscopic behaviour of mixed traffic (cyclist-pedestrian) interactions in non-motorized shared spaces. Video data were collected at two locations of the Robson Square non-motorized shared space in downtown Vancouver, British Columbia. Trajectories of cyclists and pedestrians involved in interactions were extracted using computer vision algorithms. The extracted trajectories were used to obtain several variables that describe elements of road users’ behaviour, including longitudinal and lateral distances, speed and speed differences, interaction angle, and cyclist acceleration and yaw rate. Road users were modeled as utility-based intelligent rational agents using the finite-state Markov Decision Process (MDP) framework with unknown reward functions. The study implemented Inverse Reinforcement Learning (IRL) using two algorithms, the Maximum Entropy (ME) algorithm and the Feature Matching (FM) algorithm, to recover/estimate the reward function weights of cyclists in two types of interactions with pedestrians: following and overtaking interactions. Reward function weights indicate cyclist preferences during their interactions with pedestrians in non-motorized shared spaces, and can form the key component in developing an agent-based microsimulation model for road users. Furthermore, the estimated reward functions were used to estimate cyclists’ optimal policy for such interactions. A simulation platform was developed using the estimated reward functions and the cyclist optimal policies to simulate cyclist trajectories for the validation dataset. Results show that the Maximum Entropy (ME) IRL algorithm outperformed the Feature Matching (FM) IRL algorithm, and generally provided reasonable results for modeling such interactions in non-motorized shared spaces, considering the high degrees of freedom in movement and the more complex road-user interactions in such facilities. This research is considered an important step toward developing a full Agent-Based Model (ABM) for road users in shared space facilities to evaluate the safety and efficiency of such facilities.
Keywords: Shared space modeling | Overtaking behavior | Following behavior | Simulation | Cyclist and pedestrian | Reward function
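The "reward function weights" in the abstract above suggest a reward that is a weighted sum of state features, and feature-based IRL methods typically compare discounted feature expectations of observed and simulated trajectories. The Python sketch below illustrates both quantities. The linear reward form, the discount factor, and the names `reward` and `feature_expectations` are assumptions for illustration, not the study's implementation.

```python
def reward(features, weights):
    """Assumed linear reward: weighted sum of the state's feature values."""
    return sum(w * f for w, f in zip(weights, features))


def feature_expectations(trajectories, gamma=0.95):
    """Average discounted feature sums over a set of trajectories.

    trajectories -- list of trajectories; each trajectory is a list of
                    per-step feature vectors (e.g. lateral distance, speed
                    difference; the feature choice here is hypothetical)
    gamma        -- discount factor (assumed value)
    """
    n_features = len(trajectories[0][0])
    totals = [0.0] * n_features
    for trajectory in trajectories:
        for step, features in enumerate(trajectory):
            discount = gamma ** step
            for i, value in enumerate(features):
                totals[i] += discount * value
    return [total / len(trajectories) for total in totals]
```

An IRL procedure of this family adjusts the weights until trajectories generated under the resulting optimal policy have feature expectations close to those of the observed cyclists.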