Research on Multi-Tenant Data Query Optimization for Minimizing SLA Penalty Cost
Published: 2018-04-24 04:16
Topics: multi-tenancy + data management; Source: Shandong University, 2016 doctoral dissertation
[Abstract]: Software as a Service (SaaS) is an important application delivery model in cloud computing. It is widely adopted by service providers and has become an important channel through which small and medium-sized enterprises access advanced software technology. Under the SaaS model, mature service operators generally adopt a single-instance, multi-tenancy approach in which one application instance serves many tenants with common functionality; such an application is called a multi-tenant application. Service providers deploy multi-tenant applications in the cloud, and tenants rent them on a pay-as-you-go basis. According to tenants' needs and ability to pay, providers offer SaaS applications at different levels of service quality. To secure stable service quality, a tenant signs a Service-Level Agreement (SLA) with the SaaS provider. Query response time is an important performance metric in an SLA: if the response time exceeds the deadline the SLA specifies, the tenant does not receive query results in time and the SaaS experience suffers. When query response time violates the service-level objective, the provider must pay the tenant a penalty as specified in the SLA. Providers sign SLAs of different levels with tenants according to each tenant's needs and the rental fee paid. How to optimize queries effectively, improve query efficiency, and satisfy the SLAs of different tenants so as to minimize SLA penalty cost has become a concern for service providers. From a cost-benefit perspective, providers want to satisfy all tenants' query SLAs as far as possible at low resource cost, so a multi-tenant database must share query processing resources among tenants and optimize resource utilization. A query processing architecture with shared resources inevitably leads to contention among the queries of multiple tenants, which in turn causes some tenant queries to violate their SLAs.

To minimize the provider's SLA penalty cost, SaaS multi-tenant data queries must be optimized in the cloud computing environment. The main problems and challenges are:

(1) Multi-tenant data processing needs a sound cloud organization architecture. A multi-tenant database has many tenants and a large data volume, and tenants continually join and leave, so it must rely on a cloud computing platform. The large number of nodes and data items requires good methods for data organization, node organization, and data location, which form the basis for optimizing query SLA penalty cost. However, few published works present a clear and effective cloud organization architecture for multi-tenant data.

(2) Tenant-level resource allocation is too coarse and leaves room for further optimization. Allocating resources per tenant is easy to implement, and most current work optimizes SLA penalties at tenant granularity. However, the queries of a single tenant differ in penalty cost, access frequency, and resource consumption. Processing resources therefore need to be allocated and scheduled per query for finer-grained optimization.

(3) Multi-tenant applications have many users and many concurrent queries, which easily creates processing bottlenecks. Under high load in particular, imbalance across the many cloud nodes prevents some queries from completing before their deadlines and increases SLA penalty cost. A decentralized organization structure in the cloud is a relatively effective way to avoid performance bottlenecks, so query optimization aimed at reducing SLA penalty cost should be built on one.

(4) When query processing nodes run at full load, many queries are likely to violate their SLAs. Once the processing nodes in the cloud have been configured, the arrival rate of tenant queries is not stable; at arrival peaks, queries compete for limited processing resources. Allocating resources by provisioning new processing nodes or migrating tenant data cannot resolve the contention quickly enough. An emergency query processing mechanism for peak periods is therefore needed to minimize violation penalties.

In a cloud computing environment, with the goal of minimizing the service provider's penalty cost and taking into account the isolation and customization characteristics of tenant data, this thesis studies the indexing, caching, and scheduling stages of multi-tenant data query optimization. The main work and contributions are:

(1) To address the need for a sound cloud organization architecture, a multi-tenant index mechanism supporting a P2P structure is established. The mechanism organizes multi-tenant data, indexes, and nodes in the cloud, avoids the performance bottleneck of a centralized index, and provides a good data organization basis for the subsequent SLA-based query processing optimization. The index supports the isolation requirement of tenant queries: when data is retrieved through the index, data belonging to other tenants, which is invalid for the querying tenant, is not retrieved. The mechanism stores index entries in order and supports the comparison and range queries common in SaaS applications. It stores the index and data of one tenant together on as few nodes as possible, avoiding large data transfers during query processing. It is dynamically extensible and can use the elasticity of the cloud platform to provide index services for an unbounded number of tenants. Experimental results show that, once the numbers of tenants and nodes reach a certain scale, point-query and range-query times are at least 50% and 75% lower, respectively, than with a centralized index, and penalty cost is reduced by at least 20%.

(2) To address overly coarse resource allocation, an SLA-aware multi-tenant data cache management mechanism is established. Under the P2P structure, it optimizes the cache of the multi-tenant database according to the characteristics and violation penalty values of different tenant queries, reducing the provider's penalty cost. A quantitative relationship between cached data and query penalty cost is established, which provides the basis for selecting the data to cache. The mechanism generates the cached data for each node and can substantially reduce overall penalty cost. It can also adjust cached data across nodes with relatively high efficiency. Any node can quickly dispatch a tenant query so that the query is processed on the node with the shortest processing time. Experiments on a cloud computing platform show that the penalty cost is at least 30% lower than that of the baseline algorithm.

(3) To address the SLA violations that arise when query processing nodes run at full load, a decentralized multi-tenant query scheduling mechanism that minimizes SLA penalty cost is established. By determining the processing node and processing time of each query, it ensures that, when processing resources are scarce, critical queries are returned before their deadlines first. The mechanism assigns each tenant query a priority based on its violation penalty value and the urgency of its deadline; queries with higher priority are processed first, which minimizes overall penalty cost. Based on the P2P structure, every node takes part in scheduling, which avoids a scheduling bottleneck. The data structure of the queue in which tenant queries wait to be scheduled is improved so that lookup, insertion, and deletion remain fast among a large number of tenant queries, which raises scheduling efficiency. Experiments show that, once tenant queries reach a certain number, the penalty cost of this scheduling mechanism is at least 50% lower than that of the baseline scheme. The mechanism reduces the time complexity of scheduling from O(N) to O(log² N), and experiments show that the scheduling time of one tenant query stays at about 2 ms and does not grow with the number of tenant queries.
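The scheduling idea in contribution (3), giving each tenant query a priority derived from its violation penalty and the urgency of its deadline, can be illustrated with a small single-node sketch. The abstract does not state the priority formula, so the `penalty / slack` rule below is an assumption chosen for illustration, and the names `Query`, `priority`, and `total_penalty` are hypothetical. The sketch rescans the waiting list on every decision (O(N) per choice) and does not reproduce the thesis's improved queue structure or its P2P dispatch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Query:
    tenant: str
    cost: float      # estimated processing time, in seconds
    deadline: float  # SLA deadline as an absolute time, in seconds
    penalty: float   # amount paid if the query finishes after its deadline


def priority(q: Query, now: float) -> float:
    """Assumed rule: penalty divided by remaining slack (not the thesis's formula)."""
    slack = q.deadline - now - q.cost
    if slack < 0:
        # The deadline can no longer be met, so the penalty is already sunk.
        return 0.0
    return q.penalty / max(slack, 1e-6)


def total_penalty(queries: list[Query], penalty_aware: bool) -> float:
    """Run all queries on one node from time 0 and return the penalty paid."""
    waiting = list(queries)
    now, paid = 0.0, 0.0
    while waiting:
        if penalty_aware:
            # Pick the waiting query with the highest current priority.
            nxt = max(waiting, key=lambda q: priority(q, now))
        else:
            # Baseline: first in, first out.
            nxt = waiting[0]
        waiting.remove(nxt)
        now += nxt.cost
        if now > nxt.deadline:
            paid += nxt.penalty
    return paid


demo = [
    Query("A", cost=4, deadline=20, penalty=1),
    Query("B", cost=2, deadline=3, penalty=10),
    Query("C", cost=3, deadline=6, penalty=5),
]
print("FIFO penalty:", total_penalty(demo, penalty_aware=False))
print("Penalty-aware penalty:", total_penalty(demo, penalty_aware=True))
```

On this toy workload, FIFO runs the cheap-to-delay query A first and misses the deadlines of B and C, paying 15.0, while the penalty-aware order (B, C, A) meets every deadline and pays 0.0. The numbers are illustrative only and are unrelated to the experimental results reported in the thesis.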
[Degree-granting institution]: Shandong University
[Degree level]: Doctorate
[Year degree granted]: 2016
[Classification number]: TP393.09
Article ID: 1795097
Article link: http://www.sikaile.net/guanlilunwen/ydhl/1795097.html